Contaminated food for thought


If it is to deal effectively with outbreaks of infectious diseases, Germany must streamline its
convoluted systems for reporting and communication.


Some six weeks after the first cases of potential food poisoning
were reported, diners in Germany are still contemplating their
side salads nervously, spooked by the confused information and
warnings that have been issued over the past few weeks. Which item
of greenery might be home to the deadly Escherichia coli bacterium
known as EHEC O104:H4? By 13 June, the microbe had infected 3,325
people and killed 36.

The German public has been traumatized. It took weeks for the
probable source of the bacterium to be named as an
organic-bean-sprout farm in Lower Saxony. And, inevitably,
accusations of crisis mismanagement are starting to fly.

These critical fingers, rightly, are not pointed at the scientists in
Germany (and elsewhere), who rose admirably to the challenge of
identifying and analysing the culprit. Instead, they are directed, with
some justification, at the bizarrely complicated system Germany uses
to handle disease outbreaks and track their sources, and at an
alarmingly outdated way of transmitting information between physicians
and agencies.

Ultimately responsible for disease control and prevention is the
Robert Koch Institute in Berlin. However, Germany's federalized
structure means that the institute receives its information indirectly,
through many tiers of hierarchy.

The clinical laboratories that investigate samples sent to them by
physicians and hospitals must promptly report notifiable diseases to
their district health office, of which Germany has more than 400. Each
of these offices passes the information on to its respective state
ministry, which then transfers it to the federal health ministry, which
then passes it on to its Robert Koch Institute. Days can elapse at
transfer points and, scarcely credible in 2011, some of this
information is still sent by post.

There is more. Responsibility to track the source of food-borne
infections lies not with the Robert Koch Institute, but with the
Federal Institute for Risk Assessment, part of the Ministry of Food,
Agriculture and Consumer Protection. So, together there are two federal
ministries, two federal technical institutes and 16 state ministries
that can each pronounce on progress. Inevitably, confusion emerges, as
demonstrated by the rushed and false fingering of Spanish cucumbers as
the source late last month by Hamburg's state health minister,
Cornelia Prüfer-Storcks.

Two things need to be done. First, Germany must eliminate the
information-transfer chain and introduce a centralized electronic
database that district health offices feed information into directly.
Ideally, this would be supplemented by mandatory electronic reporting
of individual cases by physicians. The US Centers for Disease Control
and Prevention in Atlanta, Georgia, operates such a system, and
the idea was discussed in Germany after the 2009 swine-flu pandemic.
However, the proposal lost political support because it threatens the
autonomy of the states.

This takes some explaining. Germany's post-war constitution was
designed to keep centralization to a minimum, and many
responsibilities, including health, were devolved to the states.
Introduced to prevent another dictator like Hitler, this principle is
hard to attack. But it was never intended to hinder Germany from
controlling politically illiterate microbes with no respect for state
borders. Clearly, a way must be found to make an exception to the
devolved-responsibility rule, at least when it comes to infectious
diseases.


The Robert Koch Institute, which has proven itself extremely competent
in handling its part of the E. coli crisis given the blocks put in its
way, needs much more power. Second, when disease threatens, Germany
needs to be able to speak to its people with one voice, no matter how
many authorities are involved in the process. This should be the
Robert Koch Institute.

EHEC O104:H4 has proven to be a particularly evil enemy. Current
agricultural practices are likely to generate other microbes of equal
virulence or worse, and these will inevitably spread as people travel.
Authorities in Germany and elsewhere must be able to keep control.


Full transparency 


Nations should release global nuclear- 
monitoring data to academics and the public. 


Under the auspices of a proposed international ban on all
nuclear-weapons tests, scientists have built a system that can
detect an illicit explosion anywhere in the world. The monitoring
network stretches from Antarctica to Siberia and captures a wealth of
useful data, not just on infrequent atomic bangs, but also on other
types of explosion, earthquakes, underwater shocks and radiation
releases.

Yet access to these data is restricted to contributing governments and
selected allied scientists, who are largely prevented from sharing the
information with the public. The diplomatic excuses offered for this
unwise and unnecessary secrecy no longer wash, particularly in light of
the March meltdowns at the Fukushima Daiichi nuclear power plant. At
a meeting in Vienna next week, scientists who used these data to inform
their governments about the scale and dangers of the Fukushima
accident, but who saw the results kept under wraps, will push for
change.

Their move deserves support. Data from the network, run by the
Comprehensive Nuclear-Test-Ban Treaty Organization (CTBTO),
should be freely available to scientists everywhere, for study in their
own right and to inform the public in times of crisis. Governments
may be nervous about such openness, but the benefits far outweigh
the risks.

The CTBTO has proved its worth in recent years. It detected North 
Korean nuclear tests in 2006 and 2009, and has captured detailed 
seismic data on major earthquakes, including the 2004 Sumatra- 
Andaman event that sparked a devastating tsunami. 

This spring, the organization's 80 radioisotope-monitoring stations 
offered the clearest global picture of low-level fallout released from 
the Fukushima plant. Government-accredited scientific institutions 
were given access to provide politicians with valuable information 
about how the radiation was spreading and whether it posed a national 
threat. But most were told not to talk about the results in public, or to 
share the data with others in academia. The reason was diplomatic: 
governments such as the United States did not want to embarrass the 
Japanese, nor pre-empt their announcements about events unfolding 
at Fukushima Daiichi. 

More generally, governments worry that radioisotope data are too 
sensitive to share. Politicians fear that, should a nuclear test occur, full 
access to incriminating data could somehow allow the offending nation 
to contest charges of weapons testing. Or perhaps that others could 
glean sensitive nuclear secrets from the isotopes in the atmosphere. 

These fuzzy fears must be weighed against the impact of the infor- 
mation vacuum that followed Fukushima. Scientists everywhere were 
asked to give assessments, yet few had access to data that would allow 
them to do so. Providing open access to the CTBTO's network would
have given experts the information they needed to make important
statements about Japan's reactors and the threats these posed to Tokyo
and beyond. The data would also have lent credibility to the Japanese
government's own statements on radiation levels in the region.

Moreover, such data are scientifically useful in their own right.
Atmospheric scientists use radioisotopes widely and the CTBTO
network is gathering a unique data set that could be used to improve
climate models or to refine meteorological studies. Scientists with
access to the data might also find some new use for them. Thus far,
nations have paid a combined US$1 billion for the network, and they
might as well put it to good use.

The network has already taken tentative steps towards openness.
Following the 2004 tsunami, member states agreed that its seismic and
hydroacoustic data could be used by accredited tsunami-warning centres
around the world. In the immediate aftermath of the Fukushima Daiichi
accident, it was allowed to share data with the International Atomic
Energy Agency.

These are positive developments, but nations should go further: the 
CTBTO data are valuable in times of both calm and crisis. Contrary 
to the concerns of some, the more people who see them, the more 
valuable they will become.


Great ape debate 


Researchers should contribute to a US analysis 
of the case for chimpanzee research. 


The historical value of the chimpanzee as a disease model is
indisputable. It was important in developing the Sabin polio
vaccine; instrumental in discovering the infectious nature of
the spongiform encephalopathies; and essential to both the creation
of a vaccine against hepatitis B and the identification, in 1989, of the
hepatitis C virus (HCV).

Humankind has benefited handsomely. Since the United States
instituted universal childhood vaccination for hepatitis B in 1991,
there has been a 98% decline in the disease in children under the age
of 15 years. And with the identification of HCV, screening of donated
blood for the virus reduced the risk of transfusion-associated hepatitis
in the United States from 4% in 1989 to almost zero in 2000.

Today, chimpanzee research is still bearing fruit, especially for
hepatitis C, a disease that infects at least 170 million people globally
and often results in permanent liver damage or cancer. No approved
vaccine yet exists. A study published in 2002 put the annual economic
costs of the disease in the United States at more than US$750 million.

The chimpanzee is the only animal model in which human strains of
HCV can replicate, making it especially important in work to develop
a vaccine. And studies in this animal have propelled at least one
hepatitis C vaccine into human trials. Other chimpanzee experiments are
making inroads in developing better therapies for the disease. The case
for chimpanzee use in some other circumstances (such as the effort to
develop a vaccine against respiratory syncytial virus, which mainly
affects infants and young children) is less strong, but is at least
arguable.

But chimpanzee studies are under fire (see page 268). Public
discomfort over the use of chimpanzees in research has reached a
historic high, with the result that the United States is now the only
country save Gabon in which invasive experiments are conducted.
Legislation has now been introduced in the US Congress that would
prohibit invasive chimpanzee research. Although the bill is unlikely
to become law any time soon in a Congress distracted by wars, debt
and a moribund economy, the Great Ape Protection and Cost Savings
Act is nonetheless a sign of the times.

So, too, is the fact that the National Institutes of Health (NIH), facing 
public pressure after proposing to return nearly 200 semi-retired chimps 
to active research, has commissioned a study by an Institute of Medicine 
(IOM) committee, which convened last month. The committee's task, 
to culminate in a report planned for the end of the year, is to determine 
whether chimpanzee studies are necessary to answer current and future 
biomedical and behavioural research questions, or for drug and vaccine 
testing — and, if so, why. 

The purview of the task that the NIH has set the IOM is troubling. It 
contains no mention of ethical aspects of the research, and the NIH has 
publicly stated that this omission was deliberate. Of the 12 current mem- 
bers of the committee, just one is a bioethicist. The agency may wish to 
divorce the science from the ethics, but society at large will not accept 
such a distinction. Nor is it intellectually defensible: a moral choice to 
use intelligent, emotionally complex creatures to their detriment, for the 
benefit of human welfare, is intimately related to what can be achieved 
scientifically. It would be wrong for the NIH to make any change in 
its support for chimpanzee research — or indeed to maintain the 
status quo — solely on the basis of the scientific report from the IOM. 

Still, the work of the committee will provide a valuable starting point 
by defining the scientific case for chimpanzee research. Working from 
this, ethicists, the public, the animal-protection lobby, scientists and 
regulators could then engage in the much-needed, wider-ranging 
debate. An ideal convener for such a discussion would be the Presiden- 
tial Commission for the Study of Bioethical Issues. 

One thing is almost certain: if the NIH and scientists do not engage
with the ethical and animal-welfare issues that are so clearly at the 
forefront of the public mind, Congress will do it for them, and the 
result may well be to shut down virtually all research using great apes, 
as happened in the European Union in 2010. 

The committee plans to gather public input at a meeting in Washington
DC in August, on a date yet to be announced. Researchers would do well
to make their views known to the IOM committee, which will receive
and consider all public comment at go.nature.com/5tdgkt.


Poison in party pills is
too much to swallow


The harm caused by designer drugs justifies the law's attempts to keep pace
with underground chemists, says Mike Cole.


Europe's underground chemists have been busy. According to a
report last month from the European Monitoring Centre for
Drugs and Drug Addiction in Lisbon, the alarming rise in the
appearance and abuse of drugs continues. In 2008, some 13 new
substances were reported. In 2009, this figure rose to 24, and in 2010,
the highest ever number of new drugs was reported, a total of 41 new
substances.

Most were cathinones (related to amphetamines) or synthetic
cannabinoids. All must have been made by skilled chemists as a
deliberate challenge to drug-control laws.

Should we try to keep ahead of those who make and use these
materials? Is the effort and expense required for chemists such as myself
to develop tests for new drugs, and to work with legal professionals to
increase the number of banned substances, really worth it? The simple
answer is yes. I have seen the effects that these chemicals can have on
those who take them. In addition to damaging medical conditions, these
drugs can induce dangerous or violent changes in mood and behaviour.
I believe that society has a duty to intervene.

Prosecuting a drug case requires a compound to be identified and shown
to be illegal, not always an easy task. Some laws control drugs by name
and chemical class, thereby outlawing a broad family of related
chemicals. Others ban only specific isomers, opening the door to
'designer' drugs manufactured to mimic the effects but not the
structure of an illegal compound. The Internet has facilitated
international trade in these materials.

One such new drug is benzylpiperazine, or BZP. Developed by the
Wellcome Research Laboratories in Beckenham, UK, in 1944 as an
anthelmintic drug to combat parasitic worms in livestock, it was
subsequently investigated as a potential antidepressant. It entered the
party scene a decade or so ago as a legal alternative to ecstasy (MDMA),
producing similar effects but not illegal at that time. Today, ecstasy
tablets bought on the streets of London or San Francisco are as likely
to contain BZP as MDMA.

BZP exemplifies the problems that new drugs and 'legal highs' pose
for law-makers. Widespread use led to the compound being banned
in the United States and much of Europe, yet it remains legal in other
places, such as Canada.

Work in my laboratory has shown that BZP is not, as many users
believe, safe. We treated immortalized cell lines from the liver and
kidney (the excretory organs that clear drugs from the body) and
fibroblasts, cells involved in wound-healing, with BZP, its precursor
chemicals and its reaction by-products. We tested both the compounds
and the impurities created in their manufacture in isolation, as
mixtures and as drug blends synthesized to mimic street samples. The
concentrations used represented those recorded in the body during
drug use.

All these chemicals were hugely toxic to both liver and kidney cell 
lines. The major impurity in BZP, dibenzylpiperazine, is especially 
toxic to the kidneys. One of the starting materials, piperazine hexa- 
hydrate, some of which can make it into the final product, is extremely 
toxic to the liver. These results start to explain the symptoms of renal 
and hepatic failure observed in people who use BZP. 

Toxicity depends on the composition and concentration of the 
mixtures, and the effects are hard to predict. Other side effects include 
insomnia, anxiety attacks, nausea, vomiting and serious palpitations 
that frequently go unreported. These effects become worse when the 
drugs are mixed with alcohol. In short, the effect 
on individuals is potentially significant, long- 
lasting and even fatal. 

Control of such drugs brings its own problems. 
Synthesis of a compound is driven underground. 
BZP is easily manufactured from piperazine 
hexahydrate and benzyl chloride, but the level 
of impurities depends on the precise quantities 
of starting materials, the reaction conditions and 
the procedures used to extract the drug from the 
reaction mixture. This presents a paradox com- 
mon in drug control: the safest option is for peo- 
ple not to ingest the chemicals, which is the aim 
of making them illegal. But making them illegal 
can make them more dangerous. 

In response to this conundrum, people on 
both sides of the debate over whether to crimi- 
nalize drugs often cite the economic benefit of 
their approach, but this argument is a red her- 
ring. Both sides have costs. Outpatient treatment after the ingestion 
of BZP costs hundreds of pounds per patient per visit. In-patient 
care, including treatment in an intensive-care unit, costs thousands 
of pounds a day. Society has a right to frown on and to seek to outlaw 
such costly behaviour. Yet the science behind a strategy of drug pro- 
hibition — quality-control methods in an analytical lab and access to 
forensic services — is expensive too. 

So, returning to the original question: should we continue to outlaw 
recreational drugs, and compounds such as BZP in particular? The evi- 
dence is mounting that even pure drugs are toxic and do harm, both in 
the short and in the longer term. When public health and safety is at risk 
then surely it is socially responsible to ban these substances, and to
provide a legislative and forensic-science system that supports such bans.


Mike Cole is professor of forensic science at Anglia Ruskin University 
in Cambridge, UK. 
e-mail: Michael.cole@anglia.ac.uk 
Targeted drug
fights melanoma 


A drug that targets a specific 
mutant protein in skin cancer 
improved survival in a clinical 
trial of 675 patients with 
advanced melanoma. 

The drug vemurafenib 
inhibits a mutated form of the 
cell-growth-promoting protein
BRAF. Mutations in this
protein are found in around 
half of all melanomas. Paul 
Chapman of the Memorial 
Sloan-Kettering Cancer 
Center in New York and his 
colleagues found that in their 
phase III trial of patients with 
metastatic melanoma and 
the BRAF mutation, almost 
half of those treated with 
vemurafenib responded to the 
drug. By contrast, the response 
rate in patients receiving an 
older chemotherapy called 
dacarbazine was only 5%. 

Six months after treatment, 
84% of those who received 
vemurafenib were still alive, 
compared with 64% of those 
who received dacarbazine. 

N. Engl. J. Med. doi:10.1056/ 
NEJMoa1103782 (2011) 


Selections from the
scientific literature


PHOTONICS


Rainbow from a
single LED


Inorganic light-emitting diodes (LEDs) are bright, stable and
efficient, but usually emit only one colour. Gyu-Chul Yi at Seoul
National University and his team have created LEDs that can be tuned
continuously from red to blue (pictured) for potential use in the
display screens of mobile devices.

Their LED consists of nanorods of the semiconductor gallium nitride,
each coated with layers of indium gallium nitride. These layers form
'quantum wells' that restrict the movement of electrons, altering the
electrons' energy levels and, ultimately, determining the wavelength
of the LED's emitted light. The thickness of the layers varies
naturally as they are deposited on the rods' multi-faceted tips. By
altering an applied voltage, the researchers force electric current to
travel through layers of different thickness, thus changing the colour
of light that the LED emits.

Adv. Mater. doi:10.1002/adma.201100806 (2011)


ANIMAL BEHAVIOUR


Fitter fish lead the pack


Schooling fish take up different positions in the group according to
their aerobic abilities. Shaun Killen at the University of Glasgow,
UK, and his colleagues noted the positions of individual juvenile
mullet (Liza aurata; pictured) of similar size as the fish schooled
in a swim tunnel in the lab, and measured certain animals' metabolic
rates and swimming abilities. When schools were swimming at high
speed, fish less able to supply oxygen to their muscles ended up at the
back, where they could reduce their workload. By contrast, fish with
higher aerobic capacity that were better able to withstand drag forces
took up positions at the front. Having fitter fish in the lead could
allow schools to maximize their swimming speed.

Proc. R. Soc. B doi:10.1098/rspb.2011.1006 (2011)


NEUROGENETICS


Extended hunt for
autism genes


Boys are four times more likely than girls to have autism, and two
studies hint at why: girls with the disorder tend to have many more
genetic mutations than boys, suggesting that girls undergo greater
genomic change before showing autistic behaviour.

Groups led by Michael Wigler at Cold Spring Harbor Laboratory in New
York and Matthew State at Yale University in New Haven, Connecticut,
conducted the most comprehensive search yet for spontaneous
duplications or deletions of stretches of DNA that may be associated
with autism spectrum disorders. In analysing the genomes of more than
1,000 people (some with autism, some unaffected family members) the
teams found at least 130 sites in the genome where spontaneous
duplications or deletions might contribute to autism risk.

State's team found that duplication of a region on chromosome 7 is
associated with autism. Autism is marked by antisocial behaviour, and
deletion of the same region is linked to Williams-Beuren syndrome, a
condition that involves hypersocial behaviour.

In a third study, Dennis Vitkup at Columbia University in New York and
his colleagues, in collaboration with Wigler, analysed relationships
between the mutated genes uncovered by Wigler's genetics study that
were likely to be involved in brain function. Many clustered into a
large network that regulates the creation and activity of connections
between nerve cells.

Neuron 70, 863-885; 886-897; 898-907 (2011)
For a longer story on this research, see go.nature.com/bscgf1


Mercury on 
the decline 


A surprising drop in 
atmospheric mercury levels 
since the mid-1990s points to 
a substantial shift in the global 
biogeochemical cycle of the 
toxic element. 

A team led by Franz Slemr 
of the Max Planck Institute 
for Chemistry in Mainz, 
Germany, compared data from 
monitoring stations in South 
Africa, Ireland and Antarctica, 
as well as measurements taken 
aboard ships in the Atlantic 
Ocean. They infer that, 
globally, mercury levels in the 
atmosphere have decreased by 
20-38% since 1996. 

Industrial mercury pollution 
has remained more or less 
constant over the past 15 years, 
leading the authors to suggest 
that decreasing re-emissions 
from soils and oceans of 
mercury deposited before the 
1990s is the most likely cause of 
the downward trend. They add 
that climate change and ocean 
acidification may further shift 
the global mercury cycle. 
Atmos. Chem. Phys. 11, 
4779-4787 (2011) 


How nicotine 
curbs weight gain 


Nicotine lessens the amount 
mice eat by activating specific 
neurons in the brain, perhaps 
explaining why people who 
stop smoking often gain 
weight. 

Marina Picciotto at Yale 
University in New Haven, 
Connecticut, and her 
colleagues found that mice 
given nicotine daily for 30 
days ate less and had lower 
body-fat levels than untreated 
mice. Nicotine increased the 
firing of brain neurons that 
produce a hormone precursor 
called pro-opiomelanocortin 
(POMC). When the POMC 
neurons fire, they release the 
hormone melanocortin. Mice 
in which the Pomc gene had 
been deleted ate the same 
amount whether or not they 
received nicotine. Those with 
a major melanocortin receptor 
gene silenced ate 
more when given 
nicotine than 
normal mice on 
nicotine. 

The melanocortin 
hormone pathway 
regulates both 
energy use and 
food intake, so the 
authors think that 
nicotine has a two- 
pronged influence 
on body weight. 
Science 332, 
1330-1332 (2011) 


BIOPHYSICS


Fluorescent cells
turned into lasers


A human cell has been engineered to form the light source of a tiny
laser, creating the first laser to use biological material to generate
light.

Malte Gather and Seok-Hyun Yun at Harvard Medical School in Boston,
Massachusetts, engineered human cells to express an enhanced version
of green fluorescent protein. They then sandwiched a suspension of the
cells between two tiny, closely spaced mirrors to concentrate and
align the light waves from the cells into a tight beam. By pulsing
individual cells with blue light, the researchers excited the
fluorescent proteins, causing them to emit light (two different lasing
levels, pictured). The result was a bright directional beam of green
laser light visible to the naked eye.

Nature Photonics doi:10.1038/nphoton.2011.99 (2011)

For a longer story on this research, see go.nature.com/iwdzj9


HIGHLY READ


Movies of the body's bacteria


The rich microbial populations of the human body shift frequently in
time, even on a daily basis, with no stable, core group of microbes
present at high levels.

Rob Knight at the University of Colorado in Boulder and his colleagues
obtained daily microbial samples from the faeces, mouth and palms of
two volunteers, over 6 months for one subject and 15 months for the
other. The authors sequenced a key genomic region of the bacteria to
assess the composition of taxonomic groups. This revealed that the
microbial communities are distinct from one body site to the next both
in each individual and between individuals. However, only a small
proportion of the observed groups persisted across all time points. The
authors suggest that factors such as diet, medication and differences
in immune-system activity may explain the temporal variations.

Genome Biol. 12, R50 (2011)


CHEMISTRY 


Recipe for a
good catalyst


Faced with the challenge of 
developing low-cost catalysts 
for some fuel cells and metal- 
air batteries, researchers have 
come up with a basic recipe 
that ensures high catalytic 
activity in a family of widely 
used materials. 

Perovskite oxides catalyse
the oxygen-reduction reaction,
a core process in fuel cells and 
batteries. Yang Shao-Horn 
and Hubert Gasteiger at the 
Massachusetts Institute of 
Technology in Cambridge and 
their group studied 15 different 
perovskite oxide materials, 
which contain transition-metal 
ions. They found that the 
materials’ catalytic activity in 
reducing molecular oxygen is 
strongly dependent on the level 
of occupancy of the transition
metal's e_g electron orbital.
Because of various electron 
interactions between atoms of 
the oxide, this occupancy level 
can vary between 0 and 2. With 
one electron in this orbital, 
catalytic activity increased 
by four orders of magnitude 
compared with oxides that have 
0 or 2 electrons in the orbital. 
An occupancy of less than 1
led to an interaction with the 
incoming oxygen that was too 
strong, whereas occupancy of 
greater than 1 made it difficult 
for the catalyst to interact with 
and adsorb the molecule. 
Nature Chem. doi:10.1038/ 
nchem.1069 (2011) 
SEVEN DAYS


The news in brief


No nuclear in Italy
The Italian government has had to kill plans to reintroduce nuclear
power, after a referendum on 12 and 13 June voted overwhelmingly to
keep the nation nuclear-free. All nuclear power stations in the country
have been closed since 1990, as a result of a referendum held after
the 1986 Chernobyl disaster. Italy is already noted for its high
dependence on imported electricity, which amounted to 14% of
electricity demand in 2009, according to Italian utility company Terna.


Arctic agreement
A barrier to exploration for oil and gas in Arctic waters was removed
on 7 June when Norway and Russia ratified a deal on how to share the
Barents Sea region, which is potentially rich in fossil fuel. In
September 2010, after a four-decade dispute about the dividing line,
the two nations' foreign ministers signed a treaty in Murmansk that
split the Barents Sea equally. The treaty will be implemented from
7 July.


RESEARCH


Bean sprouts: guilty
The source of the Escherichia coli outbreak that has swept across
Europe over the past month has been identified: bean sprouts. The
Robert Koch Institute, the German federal agency for disease
surveillance in Berlin, confirmed the culprit on 10 June. By 13 June,
36 people around the world had died and 3,324 had been infected. See
Editorial, page 251.


Tevatron clash
Two research groups at the Tevatron, the particle collider at Fermilab
in Batavia, Illinois, disagree about whether they have spotted new
particles. In the past couple of months, researchers on the Collider
Detector at Fermilab experiment reported evidence of particles not
predicted under the standard model of particle physics. But on 10 June,
researchers on the independent D0 experiment said that their data do
not confirm the signal. The two teams rarely disagree. See
go.nature.com/t46nns for more.


Vaccine provides hope for meningitis
A cheap vaccine that was rolled out in three African countries in
December has scored an early success. Burkina Faso, Mali and Niger have
all reported the lowest number of meningitis A cases ever recorded in
an epidemic season, six months after 20 million people received the
MenAfriVac vaccine (pictured; see Nature 468, 143; 2010). No one
immunized is known to have contracted the bacterial disease, which
periodically kills thousands during intense epidemics. The Meningitis
Vaccine Project (led by the World Health Organization and PATH, a
non-profit body based in Seattle, Washington) plans further
immunizations, starting with Cameroon, Chad and Nigeria later this
year.


China's Moon probe
China's lunar orbiter Chang'e-2 has left the Moon and is heading out
into the Solar System, state news media reported on 9 June. The
unmanned probe had been taking high-resolution images of the Moon's
surface as part of China's preparation for a future lunar rover.
Chang'e-2 is now headed for L2, a point beyond the Moon's orbit where
the gravitational pull of the Sun and Earth are equal. Its arrival
there in September will mark the farthest that China has ever sent a
satellite.


Frog fungus
Chytridiomycosis, a virulent fungal disease of amphibians, now affects
the entire mountainous neotropics of Central America. On 13 June,
scientists at the Smithsonian Tropical Research Institute in Panama
announced that they had found infected frogs at a site bordering the
Darien National Park, previously the only area to escape the infection.
The area is considered the "best shot" for biologists hoping to collect
frog species to preserve in captivity, said Brian Gratwicke, a
biologist at the Smithsonian Conservation Biology Institute in
Washington DC. "We would like to save all of the species in the Darien,
but there isn't time to do that now," he said. See
go.nature.com/pvelsu for more.


Age of Aquarius 
After two failed launches for
NASA’s Earth observation
projects, the space agency has
a success: Aquarius, its satellite
to measure the saltiness of
the oceans, reached orbit on
10 June. The probe will pick
up weak microwave radiation
emitted naturally by the
ocean. This radiation varies
according to the electrical
conductivity of the water,
which in turn is tied to its
salinity. Because salinity is
linked to evaporation and
water density, the data could
help scientists to confirm
theories about the global
water cycle and its response to
climate change. See
go.nature.com/jfwfhf for more.


Bias in science 

The eminent evolutionary
biologist and science historian
Stephen Jay Gould may have
fudged his numbers when
criticizing skull measurements
by nineteenth-century
American physician Samuel
Morton as a classic example
of bias influencing scientific
results. Gould — who died
in 2002 — made the charges
in 1978 (S. J. Gould Science
200, 503-509; 1978). But,
in a 7 June paper, a group of
anthropologists argues that
most of Gould's criticisms are
“poorly supported or falsified”
(J. E. Lewis et al. PLoS Biol. 9,
e1001071; 2011). See
go.nature.com/rlszy4 for more.


High food prices
World food prices are likely
to remain “high and volatile”,
according to the Food and
Agriculture Organization.
The agency warned in a
7 June update that its food
price index for May was only
slightly down from a record
level in February (see chart).
Unfavourable weather, rising
oil prices, political unrest and
the nuclear disaster in Japan
have all helped to unsettle the
food market. The higher prices
have boosted planting, but this
year’s harvests — particularly
for crops with depleted stocks,
such as maize (corn) — will be
crucial, the agency said.


Chinese windfalls
Huaneng Renewables, the
wind-energy subsidiary of
China's largest power producer,
started trading last week
after raising HK$6.23 billion
(US$800 million) in an initial
public offering (IPO) in Hong
Kong. The firm had shelved its
IPO in December because of
a lack of interest. Its re-entry
into the market follows
January's 9.5 billion renminbi
(US$1.4 billion) IPO of
Sinovel, China’s largest, and the
world’s second-largest, maker
of wind turbines.


Physics in Jordan
A US$110-million
synchrotron seems to be
on track for construction
in Amman, Jordan —
surviving a global recession,
political upheaval and the
assassinations of two members
of the project's Iranian
delegation. The Synchrotron-light
for Experimental Science
and Applications in the Middle
East (SESAME) project has
received commitments from
Israel, Iran, Jordan and the
Palestinian Authority to
provide funding as long as
two further nations commit
funds — as Egypt and Turkey
are expected to do. Chris
Llewellyn Smith, a physicist
at the University of Oxford,
UK, and president of the
SESAME council, says that he
is “confident” that the project
will deliver its first three
beamlines by 2015.


California cash
The W. M. Keck Foundation
in Los Angeles, California,
has announced the single
largest science donation in its
57-year history: a US$150-million
gift to the University
of Southern California,
Los Angeles, for scientific
research at its medical school
and two affiliated hospitals.
The foundation also gave
$110 million to the university's
medical school in 1999. For
the university, the donation on
13 June follows shortly after a
record gift of $200 million for
science research, in March (see
Nature 471, 271; 2011).


Vaccine cash bounty
The GAVI Alliance, a global
health partnership that focuses
on getting vaccines into low-income
countries, has been
promised US$600 million
more funding than it expected
at a meeting in London on
13 June. The alliance, based
in Geneva, Switzerland,
needed $3.7 billion to enable
a planned $6.8-billion
expansion of vaccination
programmes in 2011-15, yet
donors committed $4.3 billion.
The pledges include an extra
$1.3 billion from Britain and
$1 billion from the Bill &
Melinda Gates Foundation. See
go.nature.com/qlldf4 for more.


TREND WATCH
China accounted for just over
one-fifth of the world’s energy
consumption in 2010, and 48.2%
of global coal consumption,
according to statistics released
by oil company BP on 8 June
(see chart). Oil is still the world’s
leading fuel (33.6%), although
its market share has dropped
about 5% over the past decade,
whereas renewables make up
just 1.3%. Overall, however,
every fuel is being used more
than ever before, as energy use
has rebounded from a recession-induced
drop in 2009.

WORLD ENERGY USE 2010
World energy consumption rose by 5.6% from 2009; fossil fuels’
contribution remained almost static, at 87% of the total.
SOURCE: BP.


SEVEN DAYS | THIS WEEK


17 JUNE
The European
Commission reveals the
results of a vote on the
new name of Europe’s
post-2013 research
funding system.
go.nature.com/oxsurs


20-25 JUNE
In Lima, Peru,
scientists preparing the
Intergovernmental Panel
on Climate Change’s fifth
report meet to discuss
geoengineering, and the
ethics and economics of
estimating the ‘cost’ of
climate change.
go.nature.com/jfc8le


20-24 JUNE
The International
Atomic Energy Agency
holds a meeting on
nuclear safety in Vienna,
Austria, to identify
lessons from the
Fukushima disaster.
go.nature.com/jwjewx
Devastating floods on the Mississippi this year have given researchers a rare opportunity to study how the river deposits the muds that build coastal marshes.


SEDIMENTOLOGY 


Studies spy on a river’s rage 


Investigation into this year’s Mississippi floods could shape coastal restoration plans. 


BY GWYNETH DICKEY ZAKAIB 
IN VENICE, LOUISIANA 


The research vessel Acadiana rolls with
the waves in the Gulf of Mexico, 10 kilometres
off the coast of Louisiana. Scientists
and crew members scan the murky waters.
Suddenly, triggered by an acoustic signal, a 
cluster of bright-yellow buoys comes bobbing 
up to the surface. 

The captain steers towards the floats, which 
carry a radar instrument that has spent the past 
20 hours on the sea floor. The device has been 
measuring the velocity of the water pouring 
from the Mississippi River, where floodwaters 
have risen to levels not seen in decades. “There's
a raging torrent coming out,” says Carol Lutken,
associate director for research programmes at 
the Mississippi Mineral Resources Institute in 
Oxford, Mississippi, which helped to organize 
the expedition. “It’s like a fire hose.” 

The survey earlier this month is part of an 
ongoing interdisciplinary effort by research- 
ers to learn how the flooding river discharges 
water and where it deposits its sediment load. 
Those muds could have a role in restoring 
the diminishing marshes along the Louisi- 
ana coast. The flood “is a catastrophic event, 
but it’s a rare opportunity to understand the 
physics of the Mississippi delta,” says Federico 
Falcini, a physical oceanographer at the University
of Pennsylvania in Philadelphia, and
one of the project coordinators.

Heavy rains across the Mississippi water- 
shed in April led to devastating floods far up 
the waterway, forcing entire communities to 
evacuate. On 14 May, as the high water moved 
into Louisiana, the US Army Corps of Engi- 
neers began opening floodgates in the Mor- 
ganza Spillway, some 450 kilometres upstream 
of the gulf. Their purpose was to divert water 
into the Atchafalaya River, which follows its 
own course to the sea. The move spared devel- 
oped areas downstream — including the cities 
of Baton Rouge and New Orleans. It also set up 
the ideal conditions for a direct comparison of 
river dynamics and sediment deposition in two 
very different waterways. 
The Mississippi is hemmed in by a system
of embankments and kept open for shipping by
constant dredging. It reaches the sea in a few 
long, narrow channels, flanked by diminishing 
marshlands. By contrast, the less-controlled 
Atchafalaya emits a diffuse plume, which exits 
through networks of bifurcating channels and 
feeds a growing marsh called the Wax Lake 
Delta. Using satellites to compare the discharge 
of the two rivers, and the Acadiana to verify the
satellite measurements of the Mississippi, Fal- 
cini and his colleagues will try to determine the 
conditions that build healthy wetlands. “The 
theory is, if you can tune the channel geometry 
on the Mississippi River Delta, maybe it will 
do something like what the Wax Lake is doing: 
spreading and making deposition just in front 
of the river,” says Falcini. 

He and Douglas Jerolmack, a geophysicist 
at the University of Pennsylvania, hope to vali- 
date a model in which the faster water is moving 
as it exits a river, the farther into the sea it will 
carry sediment (F. Falcini and D. J. Jerolmack 
J. Geophys. Res. doi:10.1029/2010JF001802; 
2010). This relationship would be especially 
important during floods, which carry unusu- 
ally heavy loads of sediment that contribute to 
marsh-building along the coast. “We're going to 
learn a whole lot that we'll use to inform and 
expand our model,” says Jerolmack. 

Falcini’s team is one of a few groups studying the floods for clues to restoring the wetlands, which buffer the coast from hurricanes, floods and storm surges by absorbing water and wave power. The US Geological Survey says that nearly 43 square kilometres of Louisiana’s coastal marshes have been disappearing each year since 1985, owing to sea-level rise, subsidence and sediment deficit. As a result, the open waters of the gulf are creeping closer to New Orleans, increasing its vulnerability to hurricanes. “We're conscious that if we don't do something to save the landscape, the entire coast will be gone,” says Steve Mathies, executive director of the Louisiana Office of Coastal Protection and Restoration in Baton Rouge, which is developing a plan for coastal restoration.

Top: Sediment pours from the Atchafalaya on 1 June in a more diffuse flow than from the faster Mississippi. Bottom: The rivers have strikingly different deltas.

Some scientists say that the best way to save the coast is to divert more of the Mississippi's floodwaters upstream to increase the amount of sediment reaching the marshes, but they don't know how to ensure that the sediment will be captured where it is needed. Falcini’s model suggests that one way would be to widen channels that feed into the ocean, so that the water slows down and sediments settle out.

The researchers are using satellites to track sediment concentration at the water’s surface, among other variables. They will validate those data using velocity measurements and samples gathered by the Acadiana. In the coming weeks, the floodwaters will subside, and project collaborators will fly a helicopter along the 250 kilometres or so of coastline between the Atchafalaya and Mississippi deltas, landing at regular intervals to take sediment samples and discover where the flood has added or eroded soil. Taken together, these measurements should help to determine which river characteristics can be tuned to build healthy marshes. Independent teams working farther up the river will add detail.

Besides shaping restoration plans, says 
Mathies, projects such as Falcini’s could also 
help to build support for restoration measures, 
which will be expensive and could meet resist- 
ance from stakeholders such as fishermen and 
shipping interests. “We need to be able to show 
people what the returns will be,” says Mathies.

Louisiana officials have started to take a “gen- 
erational view” of coastal restoration, accepting 
that long-term benefits trump short-term inter- 
ests, says Robert Twilley, a coastal-systems ecolo- 
gist at the University of Louisiana at Lafayette. 
The trick to maintaining a healthy coastline is to 
minimize the damage from floods while maxi- 
mizing the benefits, he adds. “We need to think 
how we can reconfigure the river to accomplish 
both flood control and restoration.”
Real-time observation of yeast genes tagged to fluoresce when transcribed into RNA could help synthetic biologists to design better circuits.


SYNTHETIC BIOLOGY


Life hackers seek new tools 


Field aims to enlist techniques from molecular biology to attack fundamental challenges. 


BY ERIKA CHECK HAYDEN 


Julius Lucks has heard the criticisms of synthetic biology before: life is too complicated to be manipulated by human designers; those who try have managed to cobble together only rudimentary genetic circuits from a limited suite of parts; the results are notoriously unpredictable. Meanwhile, a few high-profile successes — such as last year’s creation of a bacterium with a synthetic genome — and enthusiastic claims that the field will solve a raft of complex health, environmental and engineering problems, only increase the pressure to deliver.

Lucks, however, is undaunted. Last month he, his wife and their young child moved across the United States, from Berkeley, California, to Ithaca, New York, where he will set up his first independent lab in the discipline at Cornell University. His optimism is representative of a new generation of synthetic biologists who are gathering to chart the course of their field this week at a conference at Stanford University in California.

Jeff Tabor, a bioengineer at Rice University in Houston, Texas, says that one goal of the conference, the fifth Synthetic Biology Meeting, is to bring more traditional molecular biologists “into the fold”, both to counter their intrinsic resistance to the concept of re-engineering life and to co-opt their tools. “There is a real difference in the way that I and people younger than me see biology and think about studying cells,” Tabor says, “but there are a tonne of scientists doing molecular biology work that is improving our ability to engineer biology.”

For example, ‘next-generation’ sequenc- 
ing machines, designed to vastly speed up the 
reading of genomes, can also offer synthetic 
biologists a better way to observe cellular 
behaviour. That in turn will help them design 
better circuits — for instance, by giving them a 
quantifiable readout of how a circuit’s modifications
affect its function. In a paper published
this month by Lucks 
and his colleagues 
at the University of 
California, Berkeley, 
the group inferred 
the three-dimen- 
sional shapes of small 
RNA molecules by 
sequencing the cor- 
responding DNA, using a technique called 
SHAPE-Seq. That strategy could help synthetic 
biologists to screen large pools of RNA rapidly 
to find those with certain structural charac- 
teristics that could be incorporated into RNA 
circuits. 

Another tool that synthetic biologists hope
to adopt was published in April by a team
led by structural biologist Robert Singer of
the Albert Einstein College of Medicine in
New York. By tagging particular genes with
a signalling molecule that fluoresces every
time the gene is transcribed, the researchers
can watch and quantify transcription in real
time. The work could give synthetic biologists
the equivalent of an electrician’s circuit tester,
helping them to engineer more predictable
biological circuits.

“Using a technology like this, you can see 
exactly what a circuit is doing and count the 
number of circuit signals that are being produced
in real time in live cells,” Tabor says.
“This is exactly what we need to help us put 
circuits together.” 

Bioengineer Adam Arkin, Lucks’ mentor 
at Berkeley, has pursued the idea that circuits 
can be made more reliable by basing parts 
on existing cellular components that already 
accomplish a certain function in the cell. Such 
‘mother parts’ could be tweaked slightly to 
yield ‘families’ of parts with similar features 
that could carry out their functions indepen- 
dently and efficiently. 

In April, the team published a proof of concept
for this approach in which they tweaked
an RNA-based gene-regulation system to 
simultaneously control the expression of mul- 
tiple genes in a cell from the bacterium Escheri- 
chia coli, and even make a simple RNA circuit. 
Because the system is entirely RNA-based, it 
eliminates the need to translate a messenger 
RNA into a protein regulator, thereby reducing 
the overall complexity of the system. 

Another approach to complexity involves 
designing multicellular circuits in which 
each cell is a circuit component. This neatly 
skirts the dilemma of trying to insulate the 
parts of a circuit from one another within 
the cytoplasm of a single cell. Chris Voigt,
who is moving from the University of
California, San Francisco, to co-direct a
new synthetic-biology institute at the Massachusetts
Institute of Technology in Cambridge,
has been pursuing this approach in
his lab with colleagues who published their
proof of principle last December. “There's
been a change in the scale of the problems 
that we can address, and this comes out of 
the tools that synthetic biology can pro- 
vide,” says Voigt. At the meeting, Voigt will 
describe his lab’s attempts to re-engineer the 
way some organisms convert nitrogen into a 
useable form through a molecular pathway 
that involves dozens of genes. 

Synthetic biologists have been organ- 
izing their own initiatives to tackle other 
obstacles. For instance, one of the field’s key 
tenets is that off-the-shelf molecular ‘parts’
could be used to program cells to carry out
specific functions, such as making a drug or
a biofuel. But such ambitious goals depend
on the quality of the available parts. So, in
late 2009, an initiative called the BIOFAB 
(International Open Facility Advancing 
Biotechnology), funded by the US National 
Science Foundation, began working to design 
reliable parts with known functions. The 
BIOFAB has now made about 3,000 well- 
characterized parts and has released around 
500 as a higher-quality curated collection. 

Yet money is scarce for this kind of work
— a challenge to be addressed at a conference
workshop that will include funding agencies 
and industry. “There needs to be a frank and 
open discussion about funding in synthetic 
biology, especially in the United States,” says 
Pam Silver, a systems biologist at Harvard 
University in Boston, Massachusetts. The 
bread-and-butter work that the field needs, 
such as fine-tuning circuitry, is more applied 
than most ‘hypothesis-driven’ research that is
the remit of agencies such as the US National 
Institutes of Health. And most funders want 
applicants to focus on specific agendas, such 
as health or biofuels. 

Indeed, Rob Carlson, a principal at the 
engineering, consulting and design company 
Biodesic in Seattle, Washington, wonders 
whether the field of synthetic biology is big 
enough to become a well-oiled engineering 
machine. This week’s conference is sold out 
at 700 attendees, with a waiting list of at least 
100, but as Carlson points out, many of those 
attending will be reporters and investors. 

“Given the complexity of the task at hand, it 
doesn't surprise me at all that we are still going 
slowly,” says Carlson.
Mutant mice generated from embryonic stem-cell lines should further understanding of human disease.


Mouse library set 
to be knockout 


Global effort to disable every mouse gene nears completion. 


BY ELIE DOLGIN 


Geneticists are on the home stretch of the largest international biological research initiative since the Human Genome Project. Launched in 2006 in North America and Europe, the effort aims to disable each of the 20,000-odd genes in the mouse genome and make the resulting cell lines available to the scientific community.

After five years and more than US$100 million, the pace is picking up. “In the next three years or so we assume we will have it completed,” says Wolfgang Wurst, director of the Institute of Developmental Genetics at the Helmholtz Centre Munich in Germany and one of the leaders of the effort’s European contribution.

“This resource will be of enormous benefit, not just to the mouse genetic community but to every scientist, every company looking at mammalian physiology, and of course everyone who wants to design better drugs and better health care,” says Steve Brown, director of the Mammalian Genetics Unit at MRC Harwell, UK. “It is one of the most significant biological resources in the past century of science, and I don’t think I’m overstating the case here.”

NATURE.COM
For more on the mouse genome, see: go.nature.com/4iifql

Previously, researchers typically spent years 
engineering mice to lack specific genes so that 
they could model human diseases involving 
those genes. This process was slow, labori- 
ous and piecemeal. And even after all that 
effort, there was often no easy way to share 
the animals with other researchers. So the 
International Knockout Mouse Consortium 
(IKMC) set out to create a library of mouse 
embryonic stem-cell lines representing every 
possible gene knockout, and then to distribute 
the cells to researchers for further study. 

A new technology — pioneered by Bill 
Skarnes and Allan Bradley at the Wellcome 
Trust Sanger Institute in Hinxton, UK, and 
described today in Nature (W. C. Skarnes 
et al. Nature 474, 337-342; 2011) — helped 
make that possible. Using a high-throughput 
gene-targeting pipeline that allowed them to 
precisely engineer hundreds of genes every 
month, the Sanger team, in collaboration with 
colleagues in Germany and the United States, 
has so far inactivated more than 9,000 genes 
in mouse embryonic stem cells. It is on track 
to knock out 7,500 more in the next few years. 
“We're really hitting our peak production now,”
Skarnes says. 
Each bespoke knockout in the Sanger
group’s library contains an added ‘conditional
allele’. This allows scientists to disrupt
gene function in a living mouse at any
body site and at any point in the animal’s 
development by the timely addition of 
enzymes that recognize the inserted allele. 
By this means, the effects of the miss- 
ing gene do not kill the mouse before the 
researchers have a chance to study it. 

“It is truly a feat of genius,” says Geoff
Hicks, a geneticist at the University of
Manitoba in Winnipeg who leads the
Canadian contribution to the IKMC. “This
paper really pushed the technology in an
extremely innovative way and met a challenge
that seemed unattainable.”

Various groups in the international 
effort are using other, non-conditional 
techniques to inactivate thousands more 
genes. Researchers in Texas, Canada and 
Germany have mutated close to 12,000 
genes using an untargeted approach called 
gene trapping, and Regeneron Pharmaceu- 
ticals, a company based in Tarrytown, New 
York, has specifically targeted around 3,500 
genes using a technology that works well in 
smaller genes but results in mice that are 
less flexible for research than conditional 
knockouts. “The approaches are complementary,”
says Aris Economides, Regeneron’s
senior director of genome engineering
technologies. “This is going to play out well
for the end user.”

To date, nearly 17,000 different genes have been knocked out, leaving only around 3,000 more to go. The Sanger team, however, hopes to replace these with conditionally targeted knockouts, because targeting allows individual genes to be manipulated with greater precision.

Already, mutant mice have been gener- 
ated from almost 1,000 of the embryonic 
stem-cell lines obtained, and the IKMC 
repositories in the United States, Canada 
and Europe receive hundreds of new orders 
every month. The next challenge is to study 
the function of each missing gene. To this 
end, the US National Institutes of Health last 
year committed $110 million over the next 
five years to characterize around 2,500 of 
the IKMC’s mutant mice through the Inter- 
national Mouse Phenotyping Consortium, 
with plans for another $110 million to define 
5,000 more if the first phase is successful. 

“Knocking out the mice is simple rela- 
tive to the huge task of finding out what 
all those genes do,” says Richard Finnell, a 
geneticist at the Texas A&M Health Science 
Center in Houston.
Species spellchecker
fixes plant glitches 


Online tool should weed out misspellings and duplications. 


BY JOHN WHITFIELD 


Brian Enquist and his collaborators were delighted with their freshly compiled data set of 22.5 million records on the distribution and traits of plants in the Americas. But their delight turned to horror when they realized that the data set contained 611,728 names: nearly twice as many as there are thought to be plant species on Earth.

Completed in December 2010, the records were intended to help Enquist and his colleagues to discern trends in how forest trees in a wide variety of environments respond to climate change. But the data were clearly full of bogus names, making it impossible to count the species in a particular area, or their relative abundance. “I started to question our ability even to compare something as basic as species diversity at two sites,” says Enquist, a plant ecologist at the University of Arizona in Tucson.

This month, Enquist’s team will unveil a solution that could help botanists and ecologists worldwide. The Taxonomic Names Resolution Service (TNRS) aims to find and fix the incorrect plant names that plague scientists’ records.

“It looks really good,” says Gabriela Lopez-Gonzalez, a plant ecologist at the University of Leeds, UK, who curates a database of forest plots. Fixing species lists by hand is arduous, she says. “This should save us a lot of time.”

She and others agree that the problem is widespread in botanical databases. “Digitization has made the problem worse,” says TNRS co-leader, botanist Brad Boyle, also at the University of Arizona. Boyle explains that as more data are added to digital records, the chance of introducing errors also increases. Even in herbarium specimens, which ought to be the gold standard for plant identification, about 15% of the names are misspelt, he says.

Many of the errors seem to arise because biologists are not as careful as they should be when entering data into digital records. The TNRS team estimates that about one-third of the names entered into online repositories — such as GenBank, the US National Institutes of Health collection of DNA-sequence data, or the Ecological Society of America’s VegBank database of plant-plot data — are incorrect.

The other problem is that names change. Old names can be abolished when experts reclassify plants as ideas about evolutionary relationships change, or when they realize the species already had a name — an occurrence almost as old as taxonomy itself. The result is that the same plant can have many names, and not everyone knows which one to use. Such synonyms are a particular problem in the study of medicinal plants, says Alan Paton, a plant taxonomist and bioinformatician at Kew Gardens in London.

Would it smell as sweet by any other name?

The TNRS was built with financial and 
technical support from iPlant, a project run by 
the US National Science Foundation to fund 
cyberinfrastructure for plant science. It cor- 
rects names by comparing lists that users feed 
into it with the 1.2 million names in the Mis- 
souri Botanical Garden's Tropicos database, one 
of the most authoritative botanical databases. 
If the TNRS cannot find a name in Tropicos, 
it uses a fuzzy-matching algorithm, similar to 
a word-processor’s spellchecker, to find and 
correct misspellings. It also hunts through 
Tropicos’s lists of alternative names and supplies 
the one that is most up to date. When Enquist 
ran the 611,728 names through the system, just 
202,252 came back, showing that two-thirds of 
them were invalid. 
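The lookup described above (exact match, then fuzzy match, then synonym resolution) can be sketched roughly as follows. This is a minimal illustration, not the TNRS implementation: the tiny reference list, the synonym table, the `resolve` and `resolve_all` helpers and the 0.85 similarity cutoff are all assumptions made for the example, and Python's standard `difflib` stands in for whatever matching algorithm the service actually uses.

```python
from difflib import get_close_matches

# Hypothetical reference data standing in for an authoritative database.
ACCEPTED = ["Quercus alba", "Quercus rubra", "Acer saccharum", "Pinus strobus"]
# Hypothetical table mapping an outdated name to its current accepted name.
SYNONYMS = {"Acer saccharophorum": "Acer saccharum"}


def resolve(name, cutoff=0.85):
    """Return the accepted name for `name`, or None if nothing is close enough."""
    candidates = ACCEPTED + list(SYNONYMS)
    if name not in candidates:
        # Fuzzy step: pick the single most similar known name above the cutoff.
        close = get_close_matches(name, candidates, n=1, cutoff=cutoff)
        if not close:
            return None
        name = close[0]
    # Synonym step: map an outdated name to the current one.
    return SYNONYMS.get(name, name)


def resolve_all(names):
    """Collapse a raw list of names to the set of distinct accepted names."""
    return {resolve(n) for n in names} - {None}
```

Run over a raw list, `resolve_all` collapses misspellings and synonyms into one entry per species, which is the kind of reduction the article reports: 202,252 of 611,728 names surviving means roughly 67% of the submitted names were dropped.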

Because Tropicos is less comprehensive for 
plants outside the Americas, the team hopes 
to link the TNRS with The Plant List (www. 
theplantlist.org), a collaborative compilation 
of databases from Kew and other sources. 
Launched online in December 2010, it aims to 
become a global record of plants. The scientists 
are also working on a tool to correct geographical
data — one that knows, for example, that
Brazil, Brasil and Brésil are the same place, and
can recognize when someone has muddled up
longitude and latitude.
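How such a geographical check might work is not described in the article; the sketch below is only one plausible reading. The variant table and the function names `canonical_country` and `fix_coordinates` are hypothetical, and the swap test relies solely on the fact that latitude must lie within ±90 degrees while longitude may range to ±180.

```python
# Hypothetical table of country-name variants (illustrative only).
COUNTRY_VARIANTS = {"brazil": "Brazil", "brasil": "Brazil", "brésil": "Brazil"}


def canonical_country(name):
    """Map a spelling variant to one canonical country name, or None if unknown."""
    return COUNTRY_VARIANTS.get(name.strip().lower())


def fix_coordinates(lat, lon):
    """Swap the pair only when it is invalid as given but valid when swapped."""
    def valid(la, lo):
        return -90 <= la <= 90 and -180 <= lo <= 180

    if valid(lat, lon):
        return lat, lon
    if valid(lon, lat):
        return lon, lat
    raise ValueError("coordinates out of range")
```

A range test like this cannot catch a swap when both values fall within ±90 degrees; detecting those cases would need extra context, such as checking the point against the recorded country.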
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Europe tackles huge fraud 


Regulators scramble to recover millions of euros awarded to fake research projects. 


BY QUIRIN SCHIERMEIER 


discouraging scientists and businesses
from participating in the research
programmes of the European Commis-
sion (EC). But the commission's notoriously
cumbersome procedures and rigid con- 
trol mechanisms have apparently not 
prevented a criminal syndicate from 
conducting a brazen fraud that has 
siphoned off millions in EC grant funds. 

Italian authorities and the European 
Anti-Fraud Office (OLAF) in Brussels, 
Belgium, have confirmed that they 
are prosecuting members of a large 
network accused of pocketing more 
than €50 million (US$72 million) in 
EC grants for fake research projects. 
In Milan, Italy, the Finance Police last 
month charged several individuals 
in relation to the fraud. In Brussels, 
meanwhile, the EC has terminated four 
collaborative projects in information 
technology, and excluded more than 
30 grant-winners from participation in 
around 20 ongoing projects. Investiga- 
tions are still under way in the United 
Kingdom, France, Greece, Austria, Swe- 
den, Slovenia and Poland. 

“We don't have any records of [previ- 
ous] fraud at such a scale,” says David 
Boublil, the commission’s spokesman 
for taxation, customs, anti-fraud and 
audit. While investigations continue, 
Italian prosecutors and OLAF will not 
disclose the names of the suspects, or 
the research projects with which they were 
involved. 

The fraud has been conducted in a “highly 
sophisticated manner, resembling money laun- 
dering”, by means of a cross-border network 
of fictitious companies and subcontractors, 
says Pavel Botkovec, a spokesman for OLAF.
Several project coordinators stand accused of
having claimed inflated costs, or expenses for
non-existent research activities and services, 
he says. 

“The projects were apparently organized 
with the sole intention to deceive the com-
mission and its control mechanisms,” says
Boublil. To make them seem legitimate, grant
applications included the names of real
scientists, established research institutes and
existing companies, he says. But in most cases
the alleged project partners were included
without their knowing.

Herbert Reul, the European Parliament research committee chair,
hopes a case of fraud will not hinder easing of grant applications.

Insiders in Brussels say that rare cases of 
minor financial dishonesty, from inflated 
invoices to smaller cases of embezzlement, are 
regarded as unavoidable in large collaborative
research projects. But the commission does
extensive checks on project partners, includ- 
ing companies, which are meant to catch large- 
scale fraud. The success of the fraud suggests 
that those involved were unusually familiar 
with weaknesses in the EC’s procedures, and 
adept at forging legal documents. 

Boublil insists that the commission has 
learned lessons from the case. All 
departments handling research grants 
— including the EC’s Information Soci- 
ety and Media Directorate General, 
which oversaw the terminated projects 
— are now trained to look out for the 
methods used by the network. Guide- 
lines for evaluating projects and their 
partners are set to be updated. The EC 
has already recovered €10 million of the 
money, and will seek to recover the rest 
through the courts, Boublil says. 

The commission is currently develop-
ing a multibillion-dollar ‘Common Stra-
tegic Framework’ which, from 2014, will
combine its various funding streams 
into a single channel for all research and 
innovation funding. Concerned about 
the burden of Brussels bureaucracy, 
several thousand European scientists 
signed a petition this year (www.trust- 
researchers.eu) calling for the frame- 
work to be “based on mutual trust and 
responsible partnering”. Some now fear 
that the fraud could hamper efforts to 
cut red tape. 

“I’m worried that some will argue
that what has happened proves that we 
need more rather than less control,” says 
Herbert Reul, chair of the European Parlia- 
ment’s committee on industry, research and 
energy, which supports the simplification of 
the EC’s funding procedures. “I sincerely hope 
that this will not happen. Actually, it is a good
sign that this worrying attempt at deceiving
the commission has been discovered and will
be punished.”

Students will be grateful for a substantial boost to Egypt’s education budget.


MIDDLE EAST 


Egypt invests in 
its science 


Latest budget establishes research as a national priority. 


BY DECLAN BUTLER 


Four months after Egypt’s revolution
toppled the authoritarian regime of
President Hosni Mubarak, science
and education are slowly emerging from the 
post-revolution chaos as national priorities. 
Revitalizing Egypt’s sclerotic and chronically 
underfunded research, education and inno- 
vation systems will require sweeping reforms 
and substantial rises in spending. But modest 
funding increases and a warmer political 
climate for research and education have left 
Egyptian scientists feeling more optimistic 
than ever before. 

“We are going to build our economy to be 
based on democracy, and science and tech-
nology,” says Maged Al-Sherbiny, president
of the Academy of Scientific Research and
Technology in Cairo and assistant minister
for research.


NATURE.COM: For more on the Arab awakening, see go.nature.com/oxsoag

On 1 June, the Egyptian cabinet approved
the first post-revolution budget, which boosted
science despite the severe social and economic
crises gripping the country. Research spending
will rise from E£2.4 billion (US$404 million)
to E£3 billion in the 2011-12 financial year.
The education budget also jumped, by 16% to
E£55.7 billion.

The increase in science spending still 
leaves it at only around 0.4% of gross domes- 
tic product (GDP), much less than the 1-2% 
that researchers say they would like. The goal 
is to reach that level within 4 years, says Al- 
Sherbiny (see ‘Grand plans’). “That target is 
optimistic,” cautions Tarek Khalil, president
and provost of Nile University in Cairo, “but
if we can do it, great.”

GRAND PLANS
Egypt is planning to substantially increase the
proportion of GDP that it spends on research
over the next four years.

Over the next three years, the government 
plans to create 50,000 research posts for young 
researchers, most of which will be government 
jobs at universities and research institutes. An 
extra E£2 billion has been secured for the plan,
according to Al-Sherbiny. Several thousand of 
the posts would be subsidized positions within 
industry, part of a broader goal to boost the 
almost non-existent levels of research in the 
private sector, he adds. 

Al-Sherbiny and science minister Amr Ezzat 
Salama have also proposed other reforms, 
including raising researchers’ salaries and 
introducing performance-based bonuses, for 
which E£1.3 billion has been secured, says 
Al-Sherbiny. Under the proposed reforms, 
the number of government research institutes 
would increase from 198 to 258 — including 
large new centres in microelectronic systems 
and solar energy. The expanded research efforts 
will focus on seven areas considered important 
for Egypt: renewable energy, with an emphasis 
on solar and wind; water, including desalina- 
tion, irrigation and groundwater management; 
food and agriculture; health, including hepati- 
tis C, cancer and obesity; information technol- 
ogy; space; and socioeconomic goals such as 
increasing science in the classroom. 

The fall of Mubarak may also finally open the 
way to a decade-old proposal by Nobel laure- 
ate Ahmed Zewail, an Egyptian-born chem- 
ist at the California Institute of Technology in 
Pasadena, to create a US$2-billion independ- 
ent, non-profit science city that would include 
centres of excellence, hire top researchers and 
teach the cream of the country’s students. The 
state has provided 120 hectares in 6th of Octo- 
ber City, outside Cairo, but public funding for 
the project looks set to be minimal. Zewail still 
needs to raise most of the $1 billion needed to 
establish the city, and a further $1 billion as an 
endowment, through philanthropy and for- 
eign aid. Zewail has set up a board of trustees 
that includes six Nobel laureates and other 
prominent individuals such as Susan Hock- 
field, president of the Massachusetts Institute 
of Technology in Cambridge, who will direct 
the project. 

Farouk El-Baz, an Egyptian-born geologist at 
Boston University in Massachusetts, says that 
even though the political interest in science has 
not yet translated into adequate funding and 
reforms, scientists must take into account the 
many other pressing post-revolution demands, 
and be patient. “I don’t think the reforms are 
enough yet, but they are going in the right 
direction; there is no question about that.” 

Al-Sherbiny says that solidarity from the 
international scientific community will help. 
“This is a time when our friends and part- 
ners need to stand by us to help us realize 
our dreams, to offer to work together, to offer 
expertise and money, to help us build the new 
system we are trying to establish,” he says.

Chimpanzees at the New Iberia Research Center in Louisiana are some of very few remaining worldwide that are still being used in invasive research.


CHIMPANZEE RESEARCH ON TRIAL


As pressure from activists builds, the United States is considering 
whether it should end invasive experiments in chimpanzees. 


BY MEREDITH WADMAN 


The unusual meeting was held in a conference room, but it
might have been called a war room. Gathered inside a little- 
known research centre in southern Louisiana, the people 
who oversee chimpanzee research in the United States were 
preparing to battle for the survival of their enterprise. 
Although no other country besides Gabon carries out invasive 
experiments with chimpanzees, the United States continues such 
work at three major research facilities. Louisiana's New Iberia Research 
Center (NIRC) is the largest, with a population of 360 chimps, used by 
investigators from pharmaceutical companies and federal agencies to 
test new drugs and study diseases such as hepatitis. During the meeting, 
Thomas Rowell, director of the NIRC, stood up, surveyed the audience, 
and launched into a presentation about possible strategies to build pub- 
lic support for their work. 
“How do we get industry to be forthcoming about their use of chim- 
panzees?” a slide read. 
“Could we get at least a few solid examples of how the use of chim- 
panzees has truncated the time to discovery?” 
And “When we talk about time and lives saved by using chimpanzees,
can we provide actual time span data or numbers?”

Another slide went on to note that the National Institutes of Health 
(NIH) spends about US$12 million a year caring for the chimpanzees 
it supports (currently totalling 734), versus the billions in health-care 
costs for the human diseases that can be studied through experiments 
on chimpanzees. One of them, hepatitis C, currently affects at least 
170 million people globally. If researchers don't have access to the 
chimp model, said Rowell, people afflicted with hepatitis C will suffer. 
“Their lifespans are going to be shortened. They will not have a proper 
quality of life.” He called them a “silent voice”.

Rowell’s pep talk in April was partly for the benefit of some visitors 
at the meeting: representatives from the Food and Drug Administra- 
tion, the National Institute of Allergy and Infectious Diseases, the drug 
industry and, most importantly, the Institute of Medicine (IOM). The 
IOM, the medical branch of the independent National Academy of 
Sciences, was asked by the NIH in January to examine whether the 
government should keep supporting biomedical research on chim- 
panzees — the closest living relatives of Homo sapiens. 

The NIH called for the study after the agency sparked a storm of
opposition last year, when it announced plans to move 186 semi-retired
chimps back into active research. After protests by the Humane Society
of the United States (HSUS) in Washington DC, famed primatolo- 
gist Jane Goodall and others, the NIH changed course and said that it 
would make no decision on moving the chimps until the IOM study 
is complete. The study, it announced, would be “an in-depth analysis 
to reassess the scientific need for the continued use of chimpanzees to 
accelerate biomedical discoveries”. 

Proponents say that the research is necessary for continued progress 
towards a hepatitis-C vaccine; for developing more effective drugs 
against hepatitis B and C; for testing monoclonal antibody treatments 
for a variety of conditions; and for research to develop a vaccine against 
respiratory syncytial virus, a seasonal virus that kills more than
66,000 children under the age of 5 each year across the globe. For
many of these conditions, backers argue, the chimpanzee is either the 
only available model, or by far the best one. 

But chimpanzee research in the United States is facing growing 
public and political opposition. Animal-welfare activists have stepped 
up their efforts to end the work, arguing that it is inhumane, ineffective 
and a waste of taxpayer money. The day after the meeting, activists held 
a press conference on Capitol Hill to mark the introduction of the Great 
Ape Protection and Cost Savings Act. The act would make all invasive 
chimpanzee research illegal, including private-sector work conducted 
at the centres and paid for by drug companies. The bill’s lead sponsor 
in the House of Representatives is Roscoe Bartlett (Republican, Mary- 
land), who trained as a physiologist and conducted primate research 
with NASA and with the military in the 1960s. 

“There's just no valid argument to continue to keep these great apes 
as they’re now being kept,” Bartlett told the news conference. “Very
few of them are used in research and I’m not sure that any of them 
need to be used.” 

The scrutiny this year adds to the tension felt by researchers who 
work with chimpanzees. That stress is particularly intense at the NIRC, 
which has been on the defensive ever since a television documentary 
two years ago showed footage of employees there mistreating and 
neglecting chimpanzees and macaques. The NIRC, which is part of 
the University of Louisiana at Lafayette, later paid a fine and has since 
passed numerous inspections, but the exposé helped to propel the 
activism. In the contest for public support, says Rowell, “our backs are 
up against the wall”. 


A STUDY UP CLOSE

On the same day that the chimp-protection measure was introduced in 
Congress, staff at the NIRC prepared to start a drug-company trial that 
used two chimpanzees to test the absorption, metabolism and excre- 
tion of an experimental medication. One of the animals was Simba, an 
88-kilogram male around 40 years old. That morning he was coaxed
from his outdoor enclosure, where he lives in a large social group, into
an individual cage. A technician used a needle and syringe to sedate
him. He was then strapped to a stretcher and transported by ambulance
to Building 52 to receive a pre-study physical examination.

Thomas Rowell directs the New Iberia Research Center in Louisiana.

At 9:27 a.m., Simba was slid off the stretcher — where it became clear 
that he had defecated — and onto a stainless-steel gurney. His fleshy 
pink gums were relaxed and prominent. He was drooling. 

“I need to do a dental on him,” said Dana Hasselschwert, head of 
the veterinary-sciences division and one of nine veterinarians on the 
NIRC staff. The veterinarians care for the centre’s chimps, along with 
its 6,500 macaques and other monkeys. Today, three technicians are 
assisting Hasselschwert with the physical. Speed is important, because 
the sedative is short-lived. Fully alert chimpanzees are strong and
sometimes violent.
One technician quickly shaved Simba’s forearms, armpits and groin.
On the skin of his right groin, a tattoo identified him as chimpanzee
number xo19. The other technicians placed electrodes on his body;
his electrocardiogram revealed a regular heart rhythm. Simba’s blood
pressure was 143/87 millimetres of mercury, normal for him,
Hasselschwert said. Blood was drawn from Simba’s left femoral vein; his
rectal temperature was taken and was normal, at 37.3 °C. His pulse was
104 beats per minute; his respirations 32.
Hasselschwert palpated his liver and kidneys and found nothing
abnormal. But one of the technicians was having trouble
catheterizing Simba to collect a urine sample. Hasselschwert placed an
ultrasound paddle on Simba’s lower abdomen and located his bladder
on a nearby screen. An assistant quickly shaved the overlying area.

“It’s undignified, a male having bikini marks,” Hasselschwert 
declared. She inserted a needle through Simba’s abdominal wall and 
withdrew three millilitres of pale yellow urine. 

Simba’s breathing was speeding up, a sign of growing wakefulness. 
“Y’all, we need to move,” Hasselschwert said. She wiped Simba’s drool-
ing gums with paper towels, and patted his open palm. His hand was
half again as big as hers. “He looks good,” she declared, and, at 9:40 a.m.,
Simba was wheeled away on the gurney and placed in a wire cage that 
measured 2 metres long by 1.5 metres wide by 2.2 metres high. The 
cage is one of many in the room, and it can be compressed if an animal 
refuses to present an appendage for injections or blood withdrawal — a 
procedure that staff call “squeezing up”. Three days later, Simba would be 
injected with the experimental drug. After that, for 72 hours, at regular 
intervals, his blood would be drawn and his urine collected from a pan 
beneath the cage. He would then be returned to his outdoor enclosure. 

Last year, the NIRC conducted 23 chimpanzee studies, which typically 
involve between two and six animals. On the day of Simba’s physical, 
ten chimps were in experiments. The remaining chimps are kept in the 
outdoor cages. To keep the chimps prepared for being research subjects, 
trainers reward them with fruit in exchange for presenting their legs 
for mock injections, or for urinating in a cup. The chimps are wary of 
strangers, at whom they are wont to hurl gravel or faeces. 

Chimpanzee studies are expensive, costing anywhere from $20,000 
to $250,000. And roughly 85% of the revenue for the NIRC comes from 
a score of pharmaceutical companies that are regular customers. (Other 
centres tilt more towards academic and governmental clients.) As well 
as conducting drug and vaccine trials, the NIRC breeds macaques for 
several companies, and is a registered importer of the monkeys. 

The other 15% of the centre’s revenue comes from government 
agencies, mainly the NIH. The biomedical agency owns 124, or roughly 
one-third, of the NIRC chimpanzees, and pays the centre to maintain 
two breeding colonies of macaques. The centre also conducts chim- 
panzee research under contract for the Centers for Disease Control 
and Prevention. It owns 11 more chimps, which are kept at Bioqual, 
a company in Rockville, Maryland, where young animals are used in
hepatitis-C studies run by the NIH’s infectious-diseases institute.

If the IOM were to recommend that the NIH stop supporting 
chimpanzee research, and if the NIH were to comply, this would, the- 
oretically, not affect the drug-company funded research at the NIRC 
and the other centres. But in practice, the directors say, it would hobble 
their enterprise, not least because some two-thirds of the chimpanzees 
available for research in the United States are owned or supported by 
the NIH (see ‘Chimpanzee Research in the United States’). What is 
more, they say, the per diem fees and user fees paid by companies for 
individual experiments do not begin to cover the long-term care of the 
animals, which is supported by NIH infrastructure grants. 

“The lifetime maintenance of chimpanzees requires a long-term com- 
mitment of financial support that individually sponsored studies do not 
provide,” the directors wrote in a jointly authored statement to Nature.


INSPIRED WORK 
Rowell, who is 52, has been working with chimps most of his life, ever
since he took a job cleaning cages at the NIRC when he was 17. He
quickly got a taste of the value of chimpanzee research, when Carleton
Gajdusek shared the 1976 Nobel Prize in Physiology or Medicine for
discovering that neurodegenerative disorders such as kuru and scrapie
are transmitted by infectious agents. As part of his research, Gajdusek
injected infected human brain tissue into chimps from the centre.

The thrill of Gajdusek’s work rubbed off on Rowell. “This is what was
so exciting, a teenager off the street working at the level that I was,
and being involved with something so huge.” Rowell
chose his career at that point, and earned a degree in veterinary medi-
cine at the Louisiana State University School of Veterinary Medicine.
Hired by the NIRC in 1990, he became its director in 1998.

Rowell has expanded the centre significantly, from about 170 
employees in 1998 to 249 today, and from 4,560 primates in 1998 to 
6,860 today. He also strengthened the NIRC’s experimental credentials 
and abilities, which has made the centre highly attractive to the phar- 
maceutical industry. 

In the years since Rowell first started working at the facility, support 
for chimp research has slowly eroded around the world. The United 
States stopped importing chimpanzees after signing a 1973 treaty ban- 
ning trade in endangered species. When the AIDS epidemic hit, the 
NIH launched a breeding programme for chimpanzees, but the agency 
declared a moratorium on breeding in 1995, after it became clear that 
chimps were a poor model for the disease. 

Soon, countries started to outlaw chimp research completely. In 
1997, the United Kingdom took that step. Another eight countries 
followed suit in the next decade, and last year, the European Union 
outlawed great-ape experimentation. 

Only one pharmaceutical company, GlaxoSmithKline, has dropped 
chimp research, at least publicly. It announced in 2008 that “the case 
for using great apes in the future is less clear than it may have been 
previously”. 

Opponents of chimp research have painted the United States as an 
outlier for continuing to allow such experiments. That charge irks the 
directors of the chimpanzee centres. Responding to a request from 
Nature, the directors catalogued 27 chimpanzee studies carried out at 
their centres by foreign companies or scientists since 2005. 

“The Europeans did not ban their companies from coming to the 
United States,” says John VandeBerg, director of the Southwest National 
Primate Research Centre in San Antonio, Texas, another of the centres 
that conducts chimp research. “And I can assure you they are not going 
to ban the importation of drugs into their countries that are developed 
using chimpanzees.” 

In the United States, public pressure to shut down the research 
intensified after the television exposé. The show contained video
footage obtained surreptitiously by an HSUS investigator who
infiltrated the NIRC and worked there for nine months. 

In one scene from the resulting documentary, aired by the ABC news 
show Nightline, a sedated chimp fell several feet from a bench to the 
cement floor of a cage. In another scene, a sedated chimp is carried by
its arms and legs, not on a stretcher. 

Footage of some of the centre’s monkeys was equally damaging. 
A technician hit a baby on the head after it bit her, and another 
employee rapped a monkey’s teeth with a metal pole. In a different 
scene, an anaesthetized monkey was allowed to fall from a chest-level 
counter to the floor of a lab room. 

The show drew strong reactions. Jane Goodall issued a statement 
saying: “In no lab I have visited have I seen so many chimpanzees 
exhibit such intense fear.” And agriculture secretary Tom Vilsack 
ordered an investigation of the NIRC’s animal-welfare practices. In 
the following 14 months, the centre underwent inspections by two 
units of the US Department of Agriculture (USDA), the NIH’s Office 
of Laboratory Animal Welfare in Bethesda, Maryland, the Association 
for Assessment and Accreditation of Laboratory Animal Care Inter- 
national in Frederick, Maryland, and auditors from every one of the 
NIRC’s pharmaceutical clients. The government agencies also paid 
surprise visits to the other facilities conducting chimpanzee research. 


THE CASE FOR RESEARCH 

In May 2010, the NIRC paid $18,000 to the USDA to settle six alleged 
violations of the Animal Welfare Act, such as leaving sedated adult 
chimps unattended with nursing infants. As part of the agreement, 
the centre neither admitted nor denied that the violations took place, 
or that they were, in fact, violations. But the NIRC has since retrained 
employees in, for instance, keeping animals safe when they are sedated. 

Rowell says that he watched all ten hours of undercover video footage 
that the HSUS turned over to the NIH. He concedes that there were 
moments of carelessness and one case of inappropriate behaviour, when 
the technician hit the infant monkey. But he says that the undercover 
operative — who was working as an aide — bears responsibility for the 
fall involving the sedated monkey. It was her job to protect the animal, 
but she had stepped away to film the room from a distance. Overall, he 
says, “I was proud of what I saw”. 

Now, focused by the IOM study, he is going on the offensive. Others 
are also speaking up, such as Christopher Walker, a hepatitis-C 
researcher based at Nationwide Children’s Hospital in Columbus, Ohio, 
who is the main academic customer at the NIRC. Walker is part of a 
team funded by a five-year, $12.5-million grant from the Bill & Melinda
Gates Foundation in Seattle, Washington, to try to develop a drug for 
hepatitis C that reinvigorates exhausted immune-system cells called T 
cells; he also relies on NIH grants. 

Walker has not spoken to the press before about his work with
chimpanzees; he has been afraid of being targeted by animal-rights
activists. But he is talking now, he says, “because we are reaching a critical 
decision point”. Walker's work focuses on unravelling the role of cellular 
immunity in hepatitis-C infection, which often leads to liver cancer, a dis- 
ease that is almost always fatal without a liver transplant. While working 
at a firm called Chiron in Emeryville, California, in the 1990s, Walker did 
the scientific groundwork in chimpanzees that led Merck to develop a
hepatitis-C vaccine programme. Several Merck employees then spun off
Okairos, a Rome-based biotech, which has since moved a vaccine into
human trials, after publishing proof-of-concept work in chimpanzees.

“The chimpanzees were absolutely critical,” Walker says, in establish- 
ing that immune-system cells called T-cells have an important role in 
controlling the hepatitis-C virus, and that any successful vaccine would 
need to generate a T-cell response. 

Walker is a strong believer that chimpanzee studies continue to be 
needed not only for developing a hepatitis-C vaccine, but also for test- 
ing the safety of new, and potentially risky, medicines to treat both 
hepatitis C and B. He points, for instance, to research published in 2009 
that showed RNA silencing to be effective in controlling hepatitis-C 
infection in chimps.

Some others who use chimpanzees see few remaining justifications 
for experiments on the animals. Michael Houghton, a virologist at the 
University of Alberta in Edmonton, and a co-discoverer of the hepati- 
tis-C virus, says that in research related to that virus, “we do not need 
the chimp any more for diagnostic development or for antiviral-drug 
development as we have the infected human available”. The risk-to- 
benefit ratio for infected people in such studies is low enough to justify 
testing in humans, he says. 

Still, Houghton supports chimpanzee use for hepatitis-C vaccine
development, because vaccines must be tested in
uninfected individuals. He also supports studies [...]
of millions of humans, it is unethical not to use the chimp model for
certain indications,” says Houghton. He also believes that it would be
unwise not to keep humanely treated chimps available in sanctuaries 
in case of bioterrorist attacks; the animals could be used to study the 
transmission of infectious bioweapons as well as vaccines and therapies, 
he says. 

Activists, though, see no rationale for continuing tests on chimps, 
partly, they say, because ever-more sophisticated in vitro methods make 
it unnecessary. They also argue that, despite the genetic similarities
between chimps and humans, they have relevant differences in, for
instance, immune-response genes, and that differences in gene expres-
sion make chimps weak as a biological model. “Stop using the excuse
that chimps are essential to this research,” says John Pippin, a physician 
who is senior medical and research adviser for the Physicians Commit- 
tee for Responsible Medicine in Washington DC. 

By the end of the year, the IOM committee will offer its own analysis 
of whether chimp research is scientifically warranted. The committee’s 
report will be the most weighty pronouncement on the issue so far in 
the United States, but it may not settle the debate. 

The ongoing controversy has taken a toll on some of those who 
work with chimpanzees. Rowell says he does not take the same 
amount of pleasure in his work that he did five years ago. “I’m
exhausted,” he says. Still, he vows to stay in the job. “It’s not something
that I do. It’s who I am.” SEE EDITORIAL P.251


Meredith Wadman is a reporter for Nature based in Washington DC.

The key to practical quantum computing and high-efficiency solar cells
may lie in the messy green world outside the physics lab.

On the face of it, quantum effects and living organisms seem
to occupy utterly different realms. The former are usually
observed only on the nanometre scale, surrounded by hard
vacuum, ultra-low temperatures and a tightly controlled laboratory
environment. The latter inhabit a macroscopic world that is warm,
messy and anything but controlled. A quantum phenomenon such as
‘coherence’, in which the wave patterns of every part of a system stay in
step, wouldn't last a microsecond in the tumultuous realm of the cell.
Or so everyone thought. But discoveries in recent years suggest that 
nature knows a few tricks that physicists don't: coherent quantum pro- 
cesses may well be ubiquitous in the natural world. Known or suspected 
examples range from the ability of birds to navigate using Earth’s mag-
netic field to the inner workings of photosynthesis, the process by
which plants and bacteria turn sunlight, carbon dioxide and water into
organic matter, and arguably the most important biochemical reaction 
on Earth. 

Biology has a knack for using what works, says Seth Lloyd, a physicist 
at the Massachusetts Institute of Technology in Cambridge. And if that 
means “quantum hanky-panky”, he says, “then quantum hanky-panky it
is”. Some researchers have even begun to talk of an emerging discipline 
called quantum biology, arguing that quantum effects are a vital, if rare, 
ingredient of the way nature works. And laboratory physicists interested 
in practical technology are paying close attention. “We hope to be able 
to learn from the quantum proficiency of these biological systems,” says
Lloyd. A better understanding of how quantum effects are maintained 
in living organisms could help researchers to achieve the elusive goal of 
quantum computation, he says. “Or perhaps we can make better energy- 
storage devices or better organic solar cells.” 


ENERGY ROUTEFINDER 

Researchers have long suspected that something unusual is afoot in 
photosynthesis. Particles of light called photons, streaming down from 
the Sun, arrive randomly at the chlorophyll molecules and other light- 
absorbing ‘antenna’ pigments that cluster inside the cells of every leaf, 
and within every photosynthetic bacterium. But once the photons’ 
energy is deposited, it doesn’t stay random. Somehow, it gets chan- 
nelled into a steady flow towards the cell’s photosynthetic reaction 
centre, which can then use it at maximum efficiency to convert carbon 
dioxide into sugars. 

Since the 1930s, scientists have recognized that this journey must be 
described by quantum mechanics, which holds that particles such as 
electrons will often act like waves. Photons hitting an antenna molecule 
will kick up ripples of energized electrons — excitons — like a rock 
splashing water from a puddle. These excitons then pass from one mol- 
ecule to the next until they reach the reaction centre. But is their path 
made up of random, undirected hops, as researchers initially assumed? 
Or could their motion be more organized? Some modern researchers 
have pointed out that the excitons could be coherent, with their waves 
extending to more than one molecule while staying in step and reinforc- 
ing one another. 

If so, there is a striking corollary. Coherent quantum waves can exist 
in two or more states at the same time, so coherent excitons would be 
able to move through the forest of antenna molecules by two or more 
routes at once. In fact, they could simultaneously explore a multitude 
of possible options, and automatically select the most efficient path to 
the reaction centre. 

Four years ago, two teams working under Graham Fleming, a 
chemist at the University of California, Berkeley, were able to obtain 
experimental proof to back up this hypothesis (see ‘Quantum fact
meets fiction’). One team used a string of very short laser pulses to 
probe the photosynthetic apparatus of the green sulphur bacterium 
Chlorobium tepidum'. The researchers had to chill their samples to
77 K with liquid nitrogen, but the data from their laser probes showed 
clear evidence of coherent exciton states. The second team carried out 
a similar study of the purple bacterium Rhodobacter sphaeroides’, and 
found much the same electronic coherence operating at temperatures 
up to 180 K.

In 2010, researchers from the first group published evidence of quan- 
tum coherence in their bacterial complex at ambient temperatures’ — 
showing that coherence is not just an artefact of cryogenic laboratory 
conditions, but might actually be important to photosynthesis in the 
real world. Around the same time, a team led by Gregory Scholes, a 

chemist at the University of Toronto in Canada, [...]
different light-absorbing chemical groups.

But how can quantum coherence last long enough to be useful in 
photosynthesis? Most physicists would have assumed that, at ambient 
temperatures, the surrounding molecular chaos in the cell destroys the 
coherence almost instantly. 

Computer simulations carried out by Lloyd and some of his col- 
leagues suggest an answer: random noise in the environment might 
actually enhance the efficiency of the energy transfer in photosynthesis 
rather than degrade it’. It turns out that an exciton can sometimes get 
trapped on particular sites in the photosynthetic chain, but simulations 
suggest that environmental noise can shake it loose gently enough to 
avoid destroying its coherence. In effect, says Lloyd, “the environment 
frees up the exciton and allows it to get to where it’s going”. 

Photosynthesis is not the only example of quantum effects in nature. 
For instance, researchers have known for several years that in some 
enzyme-catalysed reactions’, protons move from one molecule to 
another by the quantum-mechanical phenomenon of tunnelling, in 
which a particle passes through an energy barrier rather than having 
to muster the energy to climb over it. And a controversial theory of
olfaction claims that smell comes from the biochemical sensing of
molecular vibrations, a process that involves electron tunnelling
between the molecule responsible for the odour
and the receptor where it binds in the nose’.

Are such examples widespread enough to justify a whole new dis- 
cipline, though? Robert Blankenship, a photosynthesis researcher at 
Washington University in St Louis, Missouri, and a co-author with 
Fleming on the C. tepidum paper, admits to some scepticism. “My
sense is that there may well be a few cases, like the ones we know about 
already, where these effects are important,” he says, “but that many, 
if not most, biological systems will not utilize quantum effects like 
these.” But Scholes believes that there are grounds for optimism, given 
a suitably broad definition of quantum biology. “I do think there are 
other examples in biology where an understanding at the quantum- 
mechanical level will help us to appreciate more deeply how the process 
works,” he says. 




THE BIRD’S-EYE COMPASS 

One long-standing biological puzzle that might be explained by exotic 
quantum effects is how some birds are able to navigate by sensing Earth's 
magnetic field. 

The avian magnetic sensor is known to be activated by light striking 
the bird’s retina. Researchers’ current best guess at a mechanism is that 
the energy deposited by each incoming photon creates a pair of free 
radicals* — highly reactive molecules, each with an unpaired electron. 
Each of these unpaired electrons has an intrinsic angular momentum, or 
spin, that can be reoriented by a magnetic field. As the radicals separate, 
the unpaired electron on one is primarily influenced by the magnetism 
of a nearby atomic nucleus, whereas the unpaired electron on the other 
is further away from the nucleus, and feels only Earth’s magnetic field. 
The difference in the fields shifts the radical pair between two quantum 
states with differing chemical reactivity. 

“One version of the idea would be that some chemical is synthesized” 
in the bird’s retinal cells when the system is in one state, but not when 
it’s in the other, says Simon Benjamin, a physicist at the University of 
Oxford, UK. “Its concentration reflects Earth’s field orientation.” The 
feasibility of this idea was demonstrated in 2008 in an artificial photo- 
chemical reaction, in which magnetic fields affected the lifetime of a 
radical pair’. 

16 JUNE 2011 | VOL 474 | NATURE | 273

© 2011 Macmillan Publishers Limited. All rights reserved

Benjamin and his co-workers have proposed that the two unpaired
electrons, being created by the absorption of a single photon, exist in
a state of quantum entanglement: a form of coherence in which the
orientation of one spin remains correlated with that of the other, no 
matter how far apart the radicals move. Entanglement is usually quite 
delicate at ambient temperatures, but the researchers calculate that it 
is maintained in the avian compass for at least tens of microseconds 
— much longer than is currently possible in any artificial molecular 
system”. 

This quantum-assisted magnetic sensing could be widespread. Not 
only birds, but also some insects and even plants show physiological 
responses to magnetic fields — for example, the growth-inhibiting 
influence of blue light on the flowering plant Arabidopsis thaliana is 
moderated by magnetic fields in a way that may also use the radical- 
pair mechanism”. But for clinching proof that it works this way, says 
Benjamin, “we need to understand the basic molecules involved, and 
then study them in the lab”. 


SELECTED BENEFITS 

Quantum coherence in photosynthesis seems to be beneficial to the 
organisms using it. But did their ability to exploit quantum effects evolve 
through natural selection? Or is quantum coherence just an accidental 
side effect of the way certain molecules are structured? “There is a lot 
of speculation about the evolutionary question, and a lot of
misunderstanding,” says Scholes, who is far from sure about the answer. “We
cannot tell if this effect in photosynthesis is selected for, nor if there is 
the option not to use coherence to move the electronic energy. There 
are no data available at all even to address the question.”

He points out that it isn’t obvious why selection would favour coher- 
ence. “Almost all photosynthetic organisms spend most of the day trying 
to moderate light-harvesting. It is rare to be light-limited. So why would 
there be evolutionary pressure to tweak light-harvesting efficiency?” 


A novel idea


Quantum fact meets fiction 


“How your average leaf transfers energy 
from one molecular system to another 
is nothing short of a miracle ... Quantum 
coherence is key to the efficiency, you 
see, with the system sampling all the 
energy pathways at once. And the way 
nanotechnology is heading, we could 
copy this with the right materials.” 

These words are lifted from the pages 
of Ian McEwan’s novel Solar (Jonathan
Cape, 2010), which describes the 
tragicomic exploits of physicist Michael 
Beard, a Nobel laureate and philanderer, as he misappropriates an 
idea for a solar-driven method to split water into its elements. 

“I wanted to give him a technology still on the lab bench,”
says McEwan, who has previously scattered science through his 
books Enduring Love (1997) and Saturday (2005). He came across 
research into quantum photosynthesis by Graham Fleming, a 
chemist at the University of California, Berkeley, and decided that it 
was just what he needed. He fit the idea in with Beard’s supposed 
work in quantum physics with the help of Graeme Mitchison,
a physicist at the University of Cambridge, UK, who reverse-engineered
the Nobel citation for Beard that appears in Solar’s
appendix, and reads, “Beard’s theory revealed that the events 
that take place when radiation interacts with matter propagate 
coherently over a large scale compared to the size of atoms.” P.B.
Fleming agrees: he suspects that quantum coherence is not adaptive, but
is simply “a by-product of the dense packing of chromophores required
to optimize solar absorption”. Scholes hopes to investigate the issue by
comparing antenna proteins isolated from species of cryptophyte algae 
that evolved at different times. 

But even if quantum coherence in biological systems is a chance 
effect, adds Fleming, its consequences are extraordinary, making sys- 
tems insensitive to disorder in the distribution of energy. What is more, 
he says, it “enables ‘rectifier-like’ one-way energy transfer, produces the
fastest [energy-transfer] rate, is temperature-insensitive and probably a
few other things I haven't thought of”.

These effects, in turn, suggest practical uses. Perhaps most obviously,
says Scholes, a better understanding of how biological systems achieve
quantum coherence in ambient conditions will “change the way we think about design
of light-harvesting structures”. This could allow scientists to build
technology such as solar cells with improved energy-conversion efficiencies.
Seth Lloyd considers this “a reasonable expectation”, and is particularly
hopeful that his discovery of the positive role of environmental noise 
will be useful for engineering photonic systems using materials such as 
quantum dots (nanoscale crystals) or highly branched polymers stud- 
ded with light-absorbing chemical groups, which can serve as artificial 
antenna arrays. 

Another area of potential application is in quantum computing. The 
long-standing goal of the physicists and engineers working in this area 
is to manipulate data encoded in quantum bits (qubits) of information, 
such as the spin-up and spin-down states of an electron or of an atomic 
nucleus. Qubits can exist in both states at once, thus permitting the 
simultaneous exploration of all possible answers to the computation 
that they encode. In principle, this would give quantum computers the 
power to find the best solution far more quickly than today’s computers 
can — but only if the qubits can maintain their coherence, without the 
noise of the surrounding environment, such as the jostling of neigh- 
bouring atoms, destroying the synchrony of the waves. 

But biology has somehow solved that challenge: in effect, quantum 
coherence allows a photosystem to perform a ‘best-path’ quantum 
computation. Benjamin, whose main interest is in designing materials 
systems for quantum computation and information technology, sees 
the ambient-temperature avian compass as a potential guide. “If we can 
figure out how the bird’s compass protects itself from decoherence, this 
might just give us a few clues in the quest to create quantum
technologies,” he says. Learning from nature is an idea as old as mythology, but
until now, no one has imagined that the natural world has anything to
teach us about the quantum world.




Philip Ball is a writer based in London. 
Buried by bad decisions


Our brains are hard-wired to make poor choices about harm prevention 
in today’s world. But we can fight it, says Daniel Gilbert. 


The London Association for the
Prevention of Premature Burial was
founded in 1896 to prevent “premature
burial generally, and especially amongst
the members”’. Because nineteenth-century
physicians couldn't always distinguish the 
nearly dead from the really most sincerely 
dead, premature burial was a problem. 

But not a big problem. The odds of being
buried alive in 1896 were, like the odds of
being buried alive today, very close to zero.
Nonetheless, the good citizens of England
formed action committees, wrote editorials
and promoted legislation that ultimately led
to expensive safeguards against “the horrible
doom of being buried alive”’. Most of those
safeguards — such as the costly requirement 
that bodies spend time in ‘attractive waiting 
mortuaries’ before being buried — are still 
with us today. The frequency with which 
modern cadavers use this waiting period to 
demonstrate that they've been misdiagnosed 
is approximately never. 

Premature burial isn't a big problem, but 
the way we deal with big problems is. When 
an aeroplane’s fuselage rips open mid-flight,
or an offshore oil rig explodes, or a nuclear
power plant is crippled by a tsunami, we
immediately ask what could have been done
differently, blame those who didn't do it, then
allocate funds and pass legislation to make 
sure it gets done that way the next time. At 
first blush, this seems sensible. After all, no 
one is in favour of aviation accidents, reactor 
meltdowns or oil spills; so when these things 
happen, why not do everything we can to 
make sure they don’t happen again?

The answer is that because resources are 
finite, every sensible thing we do is another 
sensible thing we don’t. Alas, research shows
that when human beings make decisions,
they tend to focus on what they are getting
and forget about what they are forgoing.
For example, people are more likely to buy
an item when they are asked to choose
between buying and not buying it than
when they are asked to choose between buy- 
ing the item and keeping their money “for 
other purchases”. Although “not buying” and
“keeping one’s money” are the same thing, 
the latter phrase reminds people of some- 
thing they know but typically fail to con- 
sider: buying one thing means not buying 
another. So should we do everything in our 
power to stop global warming? To make sure 
terrorists don't board aeroplanes? To keep 
Escherichia coli out of the food supply? These 
seem like simple questions with easy answers 
only because they describe what we will 
do without also describing what we won’t.
When both are made explicit — should we 
keep hamburgers safe or aeroplanes safe? 
— these simple questions become vexing. 
Harm prevention often seems like a moral 
imperative, but because every yes entails a 
no, it is actually a practical choice. 

How are we to make that choice? In the 
seventeenth century, Blaise Pascal and Pierre 
de Fermat derived the optimal strategy for 
betting on games of chance, and in the pro- 
cess demonstrated that wise choices about 
harm prevention are always the product 
of two estimates: an estimate of odds (how 
likely is the harmful event?) and an estimate 
of consequences (how much harm will it 
cause?). If we know which harm is most 
likely and which harm is most severe, then 
we know which harm to prevent. We should 
spend less to prevent a natural disaster that 
will probably leave 3,000 people homeless 
than a communicable disease that will cer- 
tainly leave 3 million people dead, and this 
is perfectly obvious to everyone. 

Except when it isn't. 
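
The Pascal and Fermat rule described above amounts to multiplying an estimate of odds by an estimate of consequences and comparing the products. A minimal sketch in Python, using the essay's two scenarios: the probabilities 0.5 and 1.0 are assumed stand-ins for "probably" and "certainly", and the function name `expected_harm` is purely illustrative.

```python
def expected_harm(probability, people_affected):
    """Expected number of people harmed: odds times consequences."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return probability * people_affected

# Assumed values: "probably" is read as 0.5 and "certainly" as 1.0.
disaster = expected_harm(0.5, 3_000)       # natural disaster, people left homeless
disease = expected_harm(1.0, 3_000_000)    # communicable disease, deaths

# Under this rule the larger expected harm deserves the larger budget.
priority = "disease" if disease > disaster else "disaster"
print(disaster, disease, priority)
```

The sketch treats homelessness and death as comparable units only for simplicity; a real comparison would also have to weigh the severity of each outcome.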


ANCIENT MINDS 
The reason it took a pair of mathematical 
geniuses to develop a formula for rational 
choice is that human beings often don't 
make choices that way. When left to our 
own devices, we will pay more to eliminate 
a small risk of illness than to reduce a large 
one’, and more to insure ourselves against a 
scary way of dying than against every way of 
dying’. We will save all the members of a
five-person group before we will save six
members of a ten-person group’, and we will save
lives by pushing a trolley into a person but
not a person into a trolley’. Our brains were
optimized for finding food and mates on the 
African savannah and not for estimating the 
likelihood of a core breach or the impact of
overfishing. Nature has installed in each of us 
a threat-detection system that is exquisitely 
sensitive to the kinds of threats our ancestors 
faced — a slithering snake, a romantic rival, 
a band of men waving sticks — but that is 
remarkably insensitive to the odds and con- 
sequences of the threats we face today. 

For example, our brains devote a great 
deal of time and real estate to processing 
information about other people: about what
they think, know, want and intend. Because 
we specialize in understanding other minds, 
we are hypersensitive to the harms those 
minds produce. When people play economic 
games, for instance, they tend to reject unfair 
offers from their opponents — but they are 
much more likely to do so when their oppo- 
nent is a person than when their opponent 
is a computer’. When people receive electric
shocks, they describe them as considerably 
more painful when they are intentionally 
administered by a human agent’. It is bad to 
be harmed, but it is worse to be victimized. 
And so we worry more about shoe-bombers
than influenza, despite the fact that one kills
roughly 400,000 people per year and the other
kills roughly none. We worry more about our
children being kidnapped by strangers
than about becoming obese, despite the fact
that abduction is rare and diabetes is not.
Terrorists and child-molesters are agents, viruses
and French fries are objects, and agents
threaten us in a way that objects never can.
We are especially concerned when the 
threats those human agents produce are to 
our dignity, values and honour. Moral rules 
bind communities together, enable trust and 
the division of labour and cause people to 
behave honestly when no one is watching. 
Because these rules have such a crucial role 
in the formation and functioning of human 
social groups, we are obsessed with their 
violation, which is why US Weekly outsells 
The New Yorker. Unfortunately, when a tribe 
grows to nearly 7 billion people, threats to 
its sense of decency are not the most serious 
threats it faces. Climate change is caused by 
the burning of fossil fuels, not flags. Because
a decision to prevent one kind of harm is 
always a decision not to prevent another, 
the irresistible lure of moral violations can 
distract us from more crucial concerns. 


MORALS TO DIE FOR? 

Our obsession with morality can also 
discourage us from embracing practical 
solutions to pressing problems. The taboo 
against selling our bodies means that people 
who have money and need a kidney must 
die so that people who need money and have 
a spare kidney can starve. Economic mod- 
els suggest that drug abuse would decline if 
drugs were taxed rather than banned’, but 
many people have zero tolerance for policies 
that permit immoral behaviour even if they 
drastically reduce its frequency. Licensing 
prostitutes, trading pollution credits and 
paying students to stay in school may or may 
not reduce harm, but many would oppose 
these ideas even if they were proved effective. 
It is apparently better for people to suffer 
and die than to get the wrong message. 

Our species’ sociality has always been 
its greatest advantage, but it may also be its 
undoing. Because we see the world through 
a lens of friends and enemies, heroes and 
villains, alliances and betrayals, virtue and 
vice, credit and blame, we are riveted by the 
dramas that matter least and apathetic to the 
dangers that matter most. We will change 
our lives to save a child but not our light 
bulbs to save them all. 

What are we to do about the mismatch 
between the way we think and the problems 
we should be thinking about? One solution 
is to frame problems in ways that appeal to 
our nature. For example, when threats are 
described as moral violations, apathy often 
turns to action. Texas highways were awash 
in litter until 1986, when the state adopted
a slogan — ‘Don't mess with Texas’ — that 
made littering an insult to the honour of 
every proud Texan, at which point littering 
decreased by 72% (ref. 8). Hotels wasted sig- 
nificant amounts of energy washing barely- 
used towels until 2008, when researchers 
placed signs in hotel rooms that either asked 
guests to “help save the environment by 
reusing your towels” or told guests that “75% 
of the guests who stayed in this room partici- 
pated in our new resource savings program 
by using their towels more than once”’. The 
second sign suggested that laundering a 
barely-used towel was a violation of a moral 
rule that most people obeyed, and that sign 
increased towel reuse by 33%. Psychologists 
and economists have found dozens of ways 
to make problems easier to think about and 
harder to ignore. There is no shortage of 
solutions, just of the will to implement them. 

The other way to deal with the mismatch 
between the threats we face and the way we 
think is to change the way we think. People 
are capable of thinking rationally about 
odds and consequences, and it isn’t hard to 
teach them. Research shows that a simple 
five-minute lesson dramatically improves 
people’s decision-making in new domains 
a month later’’, and yet that is five minutes 
more than most people ever get. We teach 
high-school students how to read Chaucer 
and do trigonometry, but not how to think 
rationally about the problems that could 
extinguish their species. 

Psychologists have made remarkable 
progress in understanding how decision- 
making goes wrong and how it can be set 
right, and although their research generates 
bestselling books and garners Nobel Prizes, 
funding agencies typically give it low prior- 
ity. Our communal fate rests on decisions 
that could easily be improved, if only we 
would decide to do so. It is our way of think- 
ing, and not the undertaker, that threatens to 
bury us prematurely.
Democratizing clinical research


Keith Lloyd and Jo White commend a way for patients, 
clinicians and scientists to set priorities jointly. 


Research priorities are rarely set
democratically. Whereas clinical
science is largely about establishing
which treatments work best for whom,
sadly, the views of those with most to gain 
or lose — patients — are generally ignored. 
Academics, industry and other big players 
with vital roles in developing treatments 
tend to set the agenda. But their priorities 
differ from those of patients and clinicians. 
For example, the outcomes measured in a
trial of a drug may not be those of interest 
to the people who will actually take it. 
The inclusion of patient demands is not 
a panacea. It can divert scarce research 
resources and delay important treatments’. 
One solution is to try to harmonize the 
perspectives of patient and clinician. This 
is what the James Lind Alliance (JLA) Pri- 
ority Setting Partnerships in Oxford, UK, 
attempt, perhaps uniquely. Established 
in 2004 and funded by the UK Medical 
Research Council and National Institute 
for Health Research (NIHR), the JLA
brings together patients, carers and clini- 
cians to identify and rank questions about 
the effects of treatments for a given disease. 
Clinicians and academics — who may never 
meet patients — find long-held beliefs chal- 
lenged and sometimes overturned. 

The JLA process has recently been 
applied to schizophrenia — a mental illness 
affecting about one person in a hundred 
worldwide. We were involved in this exer- 
cise as clinical academics. This, plus our 
experience as recipients of grants and from 
within funding bodies, convinces us that 
money rarely goes to the studies that those 
with mental illness would choose. We 
therefore urge funders to adopt this list of 
top priorities for schizophrenia (see ‘Top
ten treatment uncertainties’), and entreat
other countries and organizations to use 
the technique involved in compiling it to 
steer other clinical research. 

SCHIZOPHRENIA RESEARCH PRIORITIES

Top ten treatment uncertainties

1. What is the best way to treat people
with schizophrenia that is unresponsive
to treatment?

2. What training is needed to recognize
the early signs of recurrence?

3. Should there be compulsory
community outpatient treatment for
people with severe mental disorders?

4. How can sexual dysfunction due
to antipsychotic-drug therapy be
managed?

5. What are the benefits of supported
employment for people with
schizophrenia in terms of quality
of life, self esteem, long-term
employment prospects and illness
outcomes?

6. Do the adverse effects of antipsychotic
drugs outweigh the benefits?

7. What are the benefits of hospital
treatment compared with home care for
psychotic episodes?

8. What are the clinical benefits and cost-
effectiveness of monitoring the physical
health of people with schizophrenia?

9. What are the clinical, social and
economic outcomes (including quality
of life and the methods and effects of
risk monitoring) of treatment by acute
day hospitals, assertive outreach teams,
in-patient units, and crisis resolution and
home treatment teams?

10. What interventions could reduce
weight gain in schizophrenia?

Some treatment uncertainties have been reformulated here as questions.

Between 2007 and 2009, we and other
collaborators from the JLA Partnership
collated 489 potential uncertainties about
the treatment of schizophrenia. These came
from clinicians, patients and their carers 
through web- and paper-based question- 
naires. We also pulled them from the UK 
Database of Uncertainties about the Effects 
of Treatments, which contains instances in 
which “no up to date systematic reviews 
exist, or up-to-date systematic reviews show 
that uncertainty continues”. 

These questions were de-duplicated to 
produce a longlist of 237 issues. Eleven 
schizophrenia partners — carers, clinicians, 
patients, funders and voluntary-sector organ- 
izations — each ranked their top ten uncer- 
tainties. These partners responded either as 
individuals, or on behalf of an organization, 
having consulted colleagues and members. 

The partnership collated the rankings, 
recording separate running totals for patient, 
carer and clinician submissions. This ena- 
bled a steering group — a subset of the part- 
ners — to examine each individual ranking, 
as well as the combined ranking, to produce 
a pooled list of 26 treatment uncertainties. 
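
The collation step can be pictured as a simple points tally. The sketch below is only an illustration in Python: the article does not say how rankings were scored, so the rule used here (ten points for a first choice, nine for a second, and so on) is an assumption, and the function names, group labels and uncertainty identifiers are all hypothetical.

```python
from collections import defaultdict

def tally(submissions, list_length=10):
    """Return per-group and combined running totals.

    submissions: iterable of (group, ranked_ids) pairs, where ranked_ids
    lists uncertainty identifiers from most to least important.
    Scoring (an assumption): rank 1 earns list_length points, rank 2 one
    fewer, and so on.
    """
    per_group = defaultdict(lambda: defaultdict(int))
    combined = defaultdict(int)
    for group, ranked_ids in submissions:
        for position, uid in enumerate(ranked_ids[:list_length]):
            points = list_length - position
            per_group[group][uid] += points
            combined[uid] += points
    return per_group, combined

def ranking(totals):
    """Sort identifiers by descending score, ties broken alphabetically."""
    return sorted(totals, key=lambda uid: (-totals[uid], uid))

# Hypothetical submissions; the identifiers are placeholders.
submissions = [
    ("patient", ["sexual_dysfunction", "weight_gain", "treatment_resistance"]),
    ("carer", ["treatment_resistance", "early_signs", "weight_gain"]),
    ("clinician", ["treatment_resistance", "early_signs", "sexual_dysfunction"]),
]
per_group, combined = tally(submissions)
print(ranking(combined))
print(ranking(per_group["patient"]))
```

Keeping the per-group totals alongside the combined ones is what would let a steering group see where patients, carers and clinicians diverge before a pooled list is drawn up.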

Finally, this list was discussed at an 
exhilarating workshop of clinicians, car- 
ers, patients, funders and voluntary-sector 
organizations in January. The JLA facilitated 
the meeting using a structured variation of 
small-group discussion called ‘nominal group
technique’ (see go.nature.com/xswwtc) to
reach moderated consensus on a top ten.

The process prevented one person domi- 
nating the discussion and encouraged all 
group members to participate. The format 
was rigorous, but flexible enough to allow 
people to revise their opinions, raise con- 
cerns and to reach consensus about any 
imbalance perceived to have emerged from 
the interim stages. 

Although the purpose of the JLA process” 
is to enable patients and those who treat 
them to have a say in what gets studied, it can 
also change clinical practice. For example, 
sexual dysfunction caused by antipsychotic 
medication emerged as a key patient priority. 
This is typically a low priority for clinicians 
prescribing medication and for companies 
assessing drug effectiveness. 

The week after the JLA workshop, a patient 
came to see one of us (K.L.) in a clinic, and 
wanted a change of antipsychotic medication 
because of sexual dysfunction. Without the 
experience of the JLA process, it is unlikely 
that this issue would have been afforded as 
much weight as it was. 

The final top ten for schizophrenia is
noteworthy for its divergence from the 
agenda of the drug industry, and begs many 
questions. Perhaps most pressing: is it ethi- 
cal to conduct research, which may include 
testing new treatments, without considering 
which outcomes matter most to those who 
will receive the treatment? And is it, in the 
long run, to drug companies’ benefit to do 
so? Such questions are particularly pertinent 
in conditions such as schizophrenia, in which 
the balance of power between researcher, 
clinician and patient is so uneven. 

What next? The team will repeat the 
exercise for depression this year and next. 
Meanwhile, the JLA is encouraging funders 
and researchers to act on the top ten rather 
than to continue with agendas devoid of 
clinician and patient input. For example, 
the NIHR is now exploring commissioning 
research on weight gain and sexual dysfunc- 
tion in schizophrenia. Assumptions that 
“researcher knows best” have had their time.
Potent tiny packages


Carl Zimmer’s primer on viruses entertains, but reveals 
little about their basic traits, says Robin Weiss. 


Rods of tobacco mosaic virus were first seen in 1939 using an electron microscope. 
Viruses propagate in every kind of
living organism and, despite being
so tiny, amount to around 5% of the 
world’s biomass. Viruses that infect algal 
blooms, for example, induce the calcifica- 
tion of their hosts and are thus responsible 
for the white cliffs of Dover in Britain, and 
other carbon sinks. Acclaimed science writer 
Carl Zimmer celebrates the versatility of these 
agents in a dozen entertaining essays in his 
latest book, A Planet of Viruses, which accom- 
panies the World of Viruses educational 
website (www.worldofviruses.unl.edu). 

Each essay deals with a different virus, 
ranging from those in bacteria and plants 
to scourges such as smallpox, severe acute 
respiratory syndrome (SARS), influenza and 
West Nile virus. There is food for thought 
here for all, even a seasoned virologist like 
me. Zimmer's writing grabs one’s interest, 
but is marred by a lack of attention to detail. 
It would be a more attractive little volume if it
were half as long again, because many of the 
essays end just when they become interesting. 

Zimmer begins with the observation in 
1898 by the Dutch microbiologist Martinus 
Beijerinck that the tobacco mosaic disease 
of plants was caused by “a contagious liv- 
ing fluid”. Dmitry Josifovich Ivanovsky is 
ignored in the account — even though he 
had also isolated and propagated tobacco 
mosaic virus from filtered plant sap six years 
earlier — because he thought it must be a 
bacterial disease. The property of transmis- 
sion through a filter too fine for bacteria to 
pass through became the defining feature 
of a virus, thanks to Beijerinck but also to 
Friedrich Loeffler and Paul Frosch, who 
reported the filterable nature of foot-and- 
mouth disease in the same year. 

Zimmer omits the marvellous subsequent 
discoveries using tobacco mosaic virus: 
its crystallization from the ‘living fluid’ by 
Wendell Stanley in 1935; its composition as 
protein wrapped around RNA by Norman 
Pirie and Frederick Bawden in 1936; the 
first visualization of virus particles through 
the electron microscope by Gustav Kausche, 
Edgar Pfannkuch and Helmut Ruska in 
1939; and the first demonstration, by Heinz 
Fraenkel-Conrat in 1955, that RNA alone can 
reconstitute infection. 
One of the most fasci- 
nating tales of twenti- 
eth-century science is 
how viruses opened up 
molecular biology, but 
you won’t find it here.

This year marks the 
centenary of Peyton 
Rous’s demonstra- 
tion that a cancer 
[...] Horns’ recalls Rous’ pioneering work
on tumour viruses, before introducing 
Richard Shope’s discovery of the rabbit 
papilloma viruses, to which the essay title 
refers. Harald zur Hausen and colleagues 
discovered human cervical papilloma 
viruses in 1983; the vaccines that target 
them to protect women against cervical 
cancer were licensed in 2006. Thus, 
100 years of tumour virology paid off. 
Rous may have had to wait longer than 
any other Nobel laureate to win his prize 
in 1966, whereas zur Hausen’s ‘incubation 
period’ was a mere 25 years. 

Owing to Zimmer’s puzzling reluc- 
tance to delve into molecular virology, 
we have to wait until the end of the last 
and best essay, on the giant mimiviruses 
— discovered only in 1992 — to learn 
that some viruses have RNA genomes 
instead of DNA. No other replicating 
systems carry their genes in the form of 
RNA, as do polio, measles, influenza and 
most plant viruses. Some viruses have 
double-stranded RNA, whereas others 
are single-stranded; some viruses carry 
a single RNA or DNA molecule, and 
others have segmented genomes like 
the different chromosomes of higher 
organisms. Neither of the two chapters 
on retroviruses mentions reverse tran- 
scription — by which the RNA genome 
is turned into DNA before inserting itself 
into host DNA — even though the most 
potent anti-HIV drugs are designed to 
block this process. 

Perhaps Zimmer thinks such facts are 
too difficult for his readership, but I view 
avoiding them as dumbing down. Which 
viruses evolved from bacteria, and which 
are more likely to have emerged as sets of 
genes that escaped from their hosts? Are 
some viruses relics from an RNA-based 
world, or are they relatively modern 
parasites derived from other living sys- 
tems? Zimmer eventually raises the last 
question, but to my mind, the fascination 
of viruses is their enormous molecular 
and evolutionary diversity as much as 
their pervasiveness in the environment. 

Concern for accuracy seems to have 
suffered as Zimmer becomes an ever 
more prolific writer. Some virologists’ 
names are wrong, for instance, as are 
some other simple facts. 

In a foreword to Peter Medawar’s 1996
collection of essays, Stephen Jay Gould
called this literary form “a weapon of wit
and instruction”. It would be difficult
for any science writer to match Medawar
or Gould; nevertheless, Zimmer’s
contributions fit this definition well.


A mushroom coral and one of its polyps drawn in the field by science illustrator Jenny Keller. 


TECHNIQUES 


Records in the field 


Good notebook skills are vital for documenting 
observations of the natural world, finds Sandra Knapp. 


Field biology: the very words conjure up
romance, danger, excitement. There is
a thrill to fieldwork that makes lab-based
scientists ask “How was your holiday?”
when one returns from a stint outside. Many
books have been written about the explorers
of the past, transcribing their logs and
journals, or fictionalizing their adventures.
This volume is refreshingly different. Biologist
Michael Canfield has compiled a set of
essays not on researchers’ travels, but on how
they capture their experiences in their notes.

Field Notes on Science & Nature is an eclectic
collection that crosses many disciplines,
from geology, botany and zoology to art
and anthropology. The variety of styles and
records described is fascinating — field
notes are very personal. Some of the contributors
take notes entirely electronically,
others in red pen in cheap notebooks. Others
use pictures more than words.

Few of us have the artistic skills of Jonathan
Kingdon or Jenny Keller, scientist-illustrators
whose drawings alone make this book worth
buying. But even the sketchiest sketch can
call to mind a place or organism in a way no
words can. I remember the field books of
a friend with whom I worked in the tropical
forests of Central
[...] mixture of description, [...]
seemingly chaotic collections evoke those
places far better than my own lists. I learned
from this, and started to sketch the plants
I collected — flower and leaf shapes, plant
forms and outlines appeared in my pages.


Field Notes on Science & Nature
CANFIELD
The tradition
has a long pedi- 
gree, encompassing 
notebook sketches 
by the great Victo- 
rian naturalists. My 
favourites are those of 
Henry Walter Bates, Alfred Russel Wallace 
and Richard Spruce, early evolutionists who 
mused on the page about why, as well as what 
and where. Keller's advocacy of standardized 
colour palettes in her essay harks back to the 
methods of eighteenth-century illustrator 
Sydney Parkinson, who accompanied Joseph 
Banks on Captain James Cook’s voyage on 
HMS Endeavour, or the Austrian Bauer 
brothers, one of whom accompanied Captain 
Matthew Flinders on HMS Investigator a few 
decades later. 

Parkinson drew and painted all of the 
plants that were collected, but for efficiency 
only coloured part of each (a practice recom- 
mended by Keller). He died on the voyage, 
but his work was enough to enable the
publication of the entire collection two 
centuries later as a series of coloured plates. 
Ferdinand Bauer’s sketches of plants and 
animals of Australia were intricately labelled 
with numbers indicating colours; it was only 
in the twentieth century that the key to the 
colours was discovered, deep in the collec- 
tions in Madrid. His brother, Franz, used the 
same key in botanical paintings he made at 
Kew, near London. 

Accuracy and speed of capture of the 
image are just as important now. But digital 
photography has not obviated the need for 
field sketches. As many contributors point 
out, a sketch can be labelled on the spot 
and does not require printers, cameras and 
other electronic hardware to be carried to 
remote places. 

Whether notes are telegraphic or detailed, 
a key to abbreviations is a must. Making field
notes directly on the computer can solve the 
transcription problem, as one only has to 
enter information once and typed text is easy 
to read, say entomologist Piotr Naskrecki 
and plant biologist Jim Reveal. But, Reveal 
adds, computerized notes lack the person- 
ality so apparent in handwritten accounts. 

Illustrated field notes can provide the 
basis for public conversations on science. 
For example, anthropologist Karen Kramer's 
sketch maps of Mayan villages aided her 
research into how the villages functioned 
because local people were happy to talk about 
her interpretations of their space. And orni- 
thologist Kenn Kaufman describes the species 
lists made through the eBird project, which 
records birders’ observations via a website. 
This crowd-sourcing method of taking field 
notes is an extension of the ‘bioblitz’ concept, 
in which members of the public list all the 
species they encounter over a short period. 

It is disturbing to observe, as ecologist 
Erick Greene does in his essay on best prac- 
tice, that today’s generation of field biolo- 
gists do not keep notes as diligently as their 
laboratory-based counterparts. Lab books 
are retained as permanent records (some- 
times drawn upon in cases of scientific mis- 
conduct), whereas field notebooks are rarely 
archived. Yet they record observations that 
might seem trivial at the time, but on reflec- 
tion become the basis for new insight. As 
ecologist Bernd Heinrich rightly says, notes 
from the field often represent a search for 
problems, not solutions. Who knows whose 
field notebooks now contain observations 
that will change the world? 

I will alter my own note-taking after read- 
ing this set of essays. All scientists, whether 
based in the field or the lab, could benefit 
from the advice given here so eloquently.


Sandra Knapp is a botanist at the Natural 
History Museum, London SW7 5BD, UK. 
e-mail: s.knapp@nhm.ac.uk 
Books in brief


The Quest for the Cure: The Science and Stories Behind the Next 
Generation of Medicines 

Brent R. Stockwell COLUMBIA UNIVERSITY PRESS 284 pp. $27.95 (2011) 
In the past 50 years, we have developed drugs to cure many major 
diseases. But treatments for some serious conditions, such as cancer 
and Alzheimer’s, still elude us. Chemical biologist Brent Stockwell 
describes the history of drug design, from the invention of mustard 
gas and early anti-cancer agents to the decoding of the human 
genome. Countering the pessimists who fear that the end is nigh for 
significant breakthroughs, he argues that emerging technologies for 
drug testing and molecular modelling will open up new avenues. 


Dog Sense: How the New Science of Dog Behavior Can Make You 
A Better Friend to Your Pet 

John Bradshaw BASIC BOOKS 352 pp. $25.99 (2011)

Although dogs are loved by many, their lot is not always a happy 
one. Originally bred as rural working animals, most dogs now live in 
cities where they are expected to be more obedient than any child. 
The perpetuation of pedigrees also mars the health of some breeds. 
Anthrozoologist John Bradshaw summarizes what science can 
teach us about man’s best friend. Arguing that modern dogs should 
not be considered domesticated wolves, he asks how we can best 
breed these social animals to be companions and family pets. 


Eruptions that Shook the World 

Clive Oppenheimer CAMBRIDGE UNIVERSITY PRESS 408 pp. £18.99 (2011) 
Closures of international airspace after the recent Icelandic 
eruptions served as a reminder that volcanoes can be disruptive. 
But volcanic outbursts have also shaped our history, from aiding 
the demise of the dinosaurs to altering climate. Ash ejected into the 
atmosphere may even have led to the meagre harvest that triggered 
the French Revolution. Volcanologist Clive Oppenheimer relates in 
rigorous detail the consequences of eruptions over the past quarter 
of a billion years, and argues that lessons can be learned for future 
risk management of catastrophes. 


The Ripple Effect: The Fate of Freshwater in the Twenty-First 
Century 

Alex Prud’homme SCRIBNER 448 pp. $27 (2011) 

Flooding and drought are both on the rise. Journalist Alex 
Prud’homme digs into the reasons why, citing centuries of neglect of 
water infrastructure and a careless attitude to issues of water quality 
and use, ownership and waste. Focusing on issues that threaten 
clean and abundant water in the United States, he travels across
the country to speak to people at the centre of the drama, including
salmon fishermen and copper miners in Alaska and scientists 
investigating intersex fish in Chesapeake Bay. 
The Fallacy of Fine-Tuning: Why the Universe is Not Designed For Us
Victor J. Stenger PROMETHEUS 345 pp. £24.95 (2011) 

The Universe seems to be fine-tuned, with precisely set parameters 
that allow life to exist as a rare event. This idea has been used by 
some to argue that humans have a central place in the cosmos, and 
even as evidence for the existence of God. Physicist Victor Stenger 
rails against this ‘fallacy’ by dismantling such assumptions one
by one. The laws of physics and cosmology constrain some key
numbers, he says, and others are not as fine-tuned or as improbable 
as proponents of the idea suggest. 
Q&A Baba Brinkman
The adaptive lyricist 


Baba Brinkman is a Canadian rap artist whose award-winning show The Rap Guide to 
Evolution wowed UK crowds at the Edinburgh Fringe Festival during Charles Darwin's 
bicentenary year. As the show opens next week for a long summer run off Broadway in New 
York, Brinkman discusses rhyme, improvisation and scientific certainty. 


Why rap about science? 

Science chose me. I have a master’s degree 
in medieval literature and I had done a hip- 
hop show based on The Canterbury Tales. 
Mark Pallen, a bacteriologist at the Univer- 
sity of Birmingham, UK, heard it and got it 
into his head that the next one should be a 
rap of On the Origin of Species. I took up the 
gauntlet. Science rapping is not the reason I 
got into rap, but I’ve found myself evolving 
into that niche. 


Is it a comfortable niche? 
Yes, science and rap go well together. Rap, in 
essence, is about speaking with conviction 
— and you can usually be certain of things 
that have been scientifically validated. I've 
upped my swagger since I’ve been writing 
about science. In my songs from five or 
six years ago, when I was immersed in the
humanities, I was a big equivocator. Not
that I’m trying to convey absolute certainty,
because science is about uncertainty and
exploration.

NATURE.COM
Listen to Baba Brinkman on the
Nature Podcast:
go.nature.com/qeaijm


The Rap Guide to Evolution
SoHo Playhouse, New York.
26 June-2 October 2011.


Do researchers mind you talking about science with certainty?

I haven't had any sci- 
entists come after me because of this show. 
I’ve covered my bases — for each thing 
I mention in my raps I can point to the 
research that made me want to highlight 
that element. What is 100% me is the sty- 
listic choice, the decision to expose this part 
of the research over that part. That is based 
on entertainment value. I don’t have to rap 
about the formation of quartz crystals. 


What’s your favourite piece from your new 
show? 

One that goes to the heart of the matter is 
‘Performance, Feedback, Revision’, which
each night contains a freestyle improvised rap 
(see go.nature.com/ubvkdz). The point is that 
mutation is similar to artistic improvisation.
Without that randomness, it is difficult to 
create new material that goes in surprising 
directions — and leads the charge on the 
evolutionary development of the artist, or the 
organism. I dramatize the evolving process 
by having an unscripted rap that the audience
can see is happening in real time. I never
know what it’s going to be. 


Can you give a sample of the lyrics from the 
non-improvised part of that song? 

“Yeah, you're just a phenotype, performing 
all the genes inside / 

Living things only seem designed, ’cause you
can’t see how they've been revised /

And the feedback lies in evolution’s brutal 
gaze / 

Either you have babies who have babies or 
get booed off stage” 


What are you going to tackle next? 

There is interest in a rap guide to climate 
change, which would be a good challenge. 
Converting people to looking at how evo- 
lution works and accepting it as a reality is 
an intellectual battle that is worth fighting. 
Accepting anthropogenic climate change as 
a reality is an important social and political 
agenda to lend my weight to. But it looks like 
there’s going to be a Rap Guide to Business 
first. New York University’s Stern School 
of Business has expressed an interest in 
commissioning me to summarize its MBA 
programme. 


INTERVIEW BY KERRI SMITH 


A. MELTON 
Extinctions:
conserve not collate 


Fangliang He and Stephen 
Hubbell correct an overestimation 
of 160% for species extinction 
rates resulting from habitat 
destruction (Nature 473, 368-371; 
2011). However, near-term 
extinction rates predicted by 
the Millennium Ecosystem 
Assessment still remain at 
400-4,000 times the background 
rate of species extinction. 

Although it may help to refine 
future predictions, we caution 
against their recommendation 
for collating more detailed 
geographical data as an urgent 
priority for conservation science. 

Knowing where species occur 
and their risk of extinction is 
fundamental for deciding where 
to focus efforts to protect them. 
But the diminishing returns 
on the value of biological 
surveys (H. S. Grantham et al. 
Conserv. Lett. 1, 190-198; 2008)
mean that more data may not
translate into significantly better 
decisions. Heterogeneity in the 
costs and likelihood of success 
of conservation actions can 
influence investment priorities 
far more. 

Areas designated a priority 
for species protection, 
identified using the ‘species–area
relationship’, are not
affected by model uncertainty, 
taxonomic group or the non- 
random distribution of species 
(M. C. Evans et al. Divers. Distrib. 
17, 437-450; 2011). 
Megan Evans, Hugh 
Possingham, Kerrie Wilson 
The University of Queensland, 
Australia. m.evans1@uq.edu.au 


Extinctions: 
consider all species 


We question Fangliang He 
and Stephen Hubbell’s claim 
that species–area relationships
overestimate global extinction
(Nature 473, 368-371; 2011).
We contend that they do not test 
their claims against real data on 
global extinction or threat. We 
also believe that they address 
only a small part of the problem. 

Imagine destruction that wipes 
out 95% of habitat overnight 
— metaphorically speaking. 

How many species will have 
disappeared the following 
morning? He and Hubbell tell
us it would be just those living
only in the destroyed area, and
not in the other 5%. In our view,
the more important question
is how many species in total,
including those in the remnant
habitat ‘islands’ (the 5%), will
eventually become extinct (see M.
L. Rosenzweig Species Diversity in
Space and Time Cambridge Univ.
Press, 1995).

Many studies accurately verify 
extinction predictions based on 
the relationship between island 
area and numbers of species, 
which He and Hubbell dismiss. 
Scores of separate tests find 
striking agreement between the 
number of predicted extinctions 
from habitat loss and the number 
of consequent extinctions (or 
of species facing extinction). 
This is seen globally and within 
individual regions, including 
eastern North America, South 
America, Africa and southeast 
Asia (see, for example, S. L. Pimm 
and R. A. Askins Proc. Natl Acad. 
Sci. USA 92, 9343-9347; 1995).
Comprehensive analyses 
can now combine remotely 
sensed ecosystem changes with 
information on species extinction 
risk, distribution, habitats, threats 
and conservation actions from 
the International Union for 
Conservation of Nature Red List. 
In our opinion, it is these studies 
— which ask the right questions 
and verify the answers — that 
have crucial implications for 
the world’s efforts to conserve 
biodiversity. 
T. M. Brooks* NatureServe, 
Virginia, USA. 
tbrooks@natureserve.org 
* On behalf of 7 co-signatories 
(see go.nature.com/tsnizs). 


Making society 
more resilient 


Japan’s government would do well 
to consider how society can adapt 
to cope with the uncertainty 
and change caused by sudden 
disastrous natural events — called 
resilience thinking — rather than 
simply trying to overcome and 
eliminate such changes. 
Catastrophic disturbances such 
as tsunamis, wildfires, flooding 
and volcanic eruptions can exact 
a huge human cost. But they may 
also have a positive impact on 
ecosystems, particularly those 
eroded by human activity. The
2004 Indian Ocean tsunami, 
for example, restored the beach 
nesting habitats for several 
threatened sea-turtle species (D. B. 
Lindenmayer and C. R. Tambiah 
Conserv. Biol. 19, 991; 2005). 
The ability of ecosystems to 
absorb natural disturbances 
and society's ability to resist 
and recover from them are 
connected. History shows that 
socio-ecological systems that 
are resilient to hazards are less 
devastated by recurring natural 
events such as hurricanes 
(W.N. Adger et al. Science 309, 
1036-1039; 2005). Ignoring the 
connection could lead to more 
unforeseen economic disasters. 
Akira S. Mori Yokohama 
National University, Japan. 
akkym@kb3.so-net.ne.jp 


Population decline 
is a long way off


Fred Pearce uses strong words
to criticize the United Nations’
latest projected global population
figures (Nature 473, 125; 2011).
But the UN’s projections of a
continuing rise in the population 
(see go.nature.com/wj3br5) are in 
line with its previous projections 
and with those of other major 
sources, including the US Census 
Bureau (see go.nature.com/owcela)
and the International
Institute for Applied Systems 
Analysis (go.nature.com/cbg34l). 

The new UN ‘medium variant’ 
projection expects 10.1 billion 
people by 2100, 3 billion more 
than now. This is a sobering 
prospect for those concerned 
with human and environmental 
poverty. 

In his book The Coming 
Population Crash (Beacon Press, 
2010), Pearce predicts a drastic 
population decline owing to 
falling fertility. But the birth 
rate worldwide still exceeds the 
replacement rate, so the young 
greatly outnumber the old. The 
number of young women coming
into reproductive age can be three
times the number becoming post- 
menopausal. So, although women 
are now having fewer children 
than they did previously, the 
number of children remains high. 
The US Census Bureau projects 
no decline in the global number 
of births to 2050. 

The result is that the population 
has risen by a billion people in 
the past 13 years and the UN's 
medium variant expects about the 
same in the next 13 years. 

None of the UN scenarios 
envisages a rise in fertility. If 
fertility stays at its present level, 
the UN projects 27 billion people 
in 2100. Only by assuming a 
continuing and rapid fall in 
fertility do projections come 
down to between 6 and 16 billion. 

Globally, there are 2.5 births for 
each death (see go.nature.com/ows9ux).
Population stability, let
alone a decline, is therefore a long 
way off. For the foreseeable future, 
the world is going to be much 
more crowded than it is now. 
Robert Wyman Yale University, 
Connecticut, USA. 
robert.wyman@yale.edu 


Brazilian soya: the 
argument for 


Your scepticism about a market- 
based approach to conservation 
in the Amazon is ill-founded 
(Nature 472, 5-6; 2011). It is
based on a misrepresentation
of the partnership in Brazil's
Santarém region between US 
agricultural giant Cargill and 
environmental group The Nature 
Conservancy. 

The aims of the Santarém 
partnership are explicitly 
environmental, not social as you 
claim. It was set up to reduce 
deforestation by enforcing 
Brazil's Forest Code (a federal 
law restricting the amount of 
deforestation) and the soya 
bean moratorium (a voluntary 
agreement by agribusiness not to 
source soya from land deforested 
after 2006). 

The partnership monitors 
farmers’ land-use practices in 
Santarém by satellite and by visits 
on the ground. Its contribution 
is crucial in the absence of a
legal mechanism to enforce the
soya moratorium and, given the
limited government resources, 
the Forest Code. 

Soya production in 
Santarém comprises less than 
0.5% of the total production 
of the Legal Amazon 
(http://sidra.ibge.gov.br), yet 
this small region receives intense 
scrutiny from scientists and the 
media. Despite this, no evidence 
has emerged that the partnership 
has failed to deter deforestation. 
We must therefore consider what 
the environmental outcome 
would have been had The Nature 
Conservancy not intervened. 
Rachael Garrett Stanford 
University, California, USA. 
rachaelg@stanford.edu 


Brazilian soya: the 
argument against 


Rachael Garrett's arguments
for a market-based approach to
Amazon conservation (see above: 
Nature doi:10.1038/474285a; 
2011) hinge on the assumption 
that the expansion of agro- 
industrial development in 
Amazonia is inevitable. Using 
market mechanisms to solve 
environmental problems
is questionable when those
problems are themselves caused 
by market-driven expansion. 

It is the relatively small 
soya-production area of Brazil’s 
Santarém region that makes it an 
important case study. If voluntary 
market-based conservation 
programmes do not work even 
on a small scale, what are the
chances of success for larger-scale 
programmes such as the Round 
Table on Responsible Soy (see 
go.nature.com/jc6ua1), hailed 
as the way to mitigate problems 
created by agro-industry? 

Conservation organizations 
must face up to the social 
consequences of their 
programmes. The Santarém case 
shows that exclusively addressing 
environmental aspects of a
complex problem exacerbates 
socio-political issues. The social 
unrest there correlates with 
environmental degradation in 
the region (C. S. Simmons et 
al. Ann. Assoc. Am. Geogr. 97, 
567-592; 2007). 

Amazonian deforestation has 
accelerated and extraction of its
resources has continued under
the market-based conservation 
paradigm. It is time for a radical 
rethink of the development 
model. 

Brenda Baletti University of 
North Carolina, Chapel Hill, USA. 
bbaletti@email.unc.edu 


Peer reviews: some 
are already public 


Several journals are already 
making anonymized reviewers’ 
reports public for published 
papers, as Daniel Mietchen 
proposes (Nature 473, 452; 
2011). These include Atmospheric 
Chemistry and Physics (see 
go.nature.com/qamrfc) and The 
EMBO Journal (see Nature 468, 
29-31; 2010). But at the European 
Molecular Biology Organization, 
we do not see an equitable way to 
publish referee reports on rejected 
manuscripts. 

Instead, we favour the transfer 
between journals of rejected 
manuscripts, along with full 
referee reports that could be 
made public after acceptance of 
the paper. An extension of this 
might be to release referee names 
after several years, or to sign the 
reports with anonymized digital 
identifiers that could be read by 
official bodies to help evaluate 
academic performance. 

Bernd Pulverer The EMBO 
Journal, European Molecular 
Biology Organization, Germany. 
bernd.pulverer@embo.org 
Competing interests declared
(see go.nature.com/witfzb).


Change Chinese 
returnee rules 


Developing countries rely on free 
movement of skilled scientists 
for the inflow of information 
and technology. China's rigid 
citizenship regulations are 
hindering the return of highly 
trained Chinese scientists from 
abroad, and must be changed if 
modernization is to be effective. 
Of more than 1.62 million 
Chinese who left to study abroad 
before 2009, less than one-third 
have returned. China was the
second largest country of origin
for science and engineering 
students in US higher education 
in 2009 (see go.nature.com/evj2t9).
Almost 90% of Chinese
scientists and engineers trained 
overseas remained there. 

At present, a Chinese researcher 
naturalized in another country 
sacrifices his or her Chinese 
citizenship and needs a temporary 
visa to return to China. Unless 
foreign citizenship is renounced, 
he or she is denied the right 
to open a bank account, buy a 
house or register a company. This 
bureaucracy deprives the nation of 
scientific and technological know- 
how, entrepreneurial capital, 
international experience and 
access to professional networks. 

One solution would be for 
China to recognize a type of 
dual citizenship, as in India. This 
would allow Chinese scientists to 
enjoy unlimited, visa-free trips 
back to China and preserve such 
rights as access to medical care, 
social security, income tax and 
intellectual-property protection, 
although not the right to vote. 
Jun Li International Centre for 
Research on Environment and 
Development (CIRED), France. 
jun.li@centre-cired.fr 


Worm scientist’s 
identity revealed 


The mystery scientist so 
hauntingly quoted on the 
ubiquity of roundworms in Ralph 
Buchsbaum’s 1938 textbook
Animals Without Backbones 
(Nature 474, 6; 2011) is biologist 
Nathan Cobb (1858-1932). 
Cobb’s pioneering work laid
the foundations for the systematic 
discovery and study of nematodes. 
Members of the Nematoda 
are best known for supplying 
us with the model organism 
Caenorhabditis elegans, but it is 
their abundance and diversity that 
makes them central to biology. 
Cobb would have undoubtedly 
been thrilled, but perhaps not 
surprised, by the discovery of 
his beloved worms more than 
3 kilometres inside Earth's crust. 
Mark J. F. Brown Royal 
Holloway, University of London, 
Surrey, UK. 
mark.brown@rhul.ac.uk
NEWS & VIEWS


ANIMAL BEHAVIOUR 


Born leaders 


In animals that live in groups, some individuals are leaders and others are followers. A modelling study shows that variation 
in leadership evolves spontaneously and need not be related to differences in knowledge or power. 


FRANZ J. WEISSING 


benefits of group living, they have to stay
together. However, individuals differ in
their preferences as to where to go and what 
to do next. If all individuals follow their own 
preferences, group coherence is undermined, 
resulting in an outcome that is unfavourable 
for everyone. Neglecting one’s own preferences 
and following a leader is one way to resolve 
this coordination problem. But what attributes 
make an individual a ‘leader’? A modelling 
study by Johnstone and Manica’ illuminates 
this question. 

Writing in Proceedings of the National 
Academy of Sciences, the authors consider a 
famous coordination problem known to game 
theorists as the Battle of the Sexes”. Imagine a 
married couple who want to spend the evening 
together. Husband and wife (the players) can 
either go to a football game or to the opera, 
but they cannot communicate with each other 
about where to meet. Neither wants to miss 
their partner by going to a different event 
from them. If that happens, both get a pay- 
off of zero. When they go to the same event, 
the wife would prefer the opera, whereas the 
husband would prefer the football game. 
When meeting at the same event, the players 
get the pay-offs 1 and 1— k (where 0<k<1), 
depending on whether or not they realize their 
preferred option. 

Johnstone and Manica model such an inter- 
action (and generalize it to the case of more 
than two players). They assume that the same 
players interact repeatedly, and that each time 
they can either choose their preferred option 
or copy the previous action of the other player. 
Each player is characterized by a strategy, A, 
corresponding to the player’s probability of 
sticking to his or her preferred action. This 
strategy is viewed as a player’s degree of leader- 
ship: players with a high value of A are leaders, 
in that they ignore the actions of others and 
obey their own preferences; players with alow 
value of A are followers, in that they tend to 
copy the choices of others. 

Johnstone and Manica’ investigate how 
natural selection shapes intrinsic leadership 
in a population in which individuals produce 
offspring in proportion to their pay-off in 


S ocial animals face a dilemma. To reap the 


288 | NATURE | VOL 474 | 16 JUNE 2011 


Figure 1 | Out in front. An implication of Johnstone and Manica’s model’ is that leadership may simply 
reflect an intrinsic tendency to follow one’s own preferences. 


the coordination game. A population of only 
leaders (A = 1) is not evolutionarily stable: if 
both players stick to their preferred option, 
they will never meet and will get a pay-off of 
zero. Likewise, a population of only followers 
(A = 0) is not stable, because the players will 
again miss each other if both have a tendency 
to dither, continually switching to the previ- 
ous action of the other. Instead, the population 
will first evolve to an intermediate value of A 
(say, A = 0.5). But, intriguingly, this is not the 
final outcome. From the intermediate strategy, 
the population will diversify and evolve to a 
state where two strategies coexist — a leader 
strategy (say, A = 0.9) and a follower strategy 
(say, A = 0.1). 

This outcome makes intuitive sense, because 
a leader-follower pair of players is most effi- 
cient in solving the coordination problem: both 
will eventually choose the preferred option of 
the leader. The leaders seem to have the better 
part (a pay-off of 1 is higher than a pay-off of
1 − k), but this holds only when they are teamed
up with a follower. On the population level,
leaders and followers have the same average
pay-off. This is because leaders are more frequent
than followers (because of their higher
pay-off in leader–follower interactions) and
therefore find themselves relatively often 
teamed up with another leader (giving a 
pay-off of zero). 
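
The logic of the repeated game can be made concrete with a small calculation. The sketch below is not Johnstone and Manica's own code. It assumes a pair of players who both start at their own preferred option, and it tracks the exact probability of every pair of choices from round to round. The function name average_payoffs, the default k = 0.5 and the number of rounds are illustrative assumptions.

```python
def average_payoffs(a0, a1, k=0.5, rounds=200):
    """Expected per-round pay-offs for players 0 and 1.

    Player i prefers option i. In each round a player picks the preferred
    option with probability a_i (the strategy written A in the text) and
    otherwise copies the partner's previous choice.
    """
    # Probability of each state (choice of player 0, choice of player 1).
    # Assumption: both players begin at their own preferred option.
    dist = {(0, 1): 1.0}
    total0 = total1 = 0.0
    for _ in range(rounds):
        new_dist = {}
        for (x0, x1), p in dist.items():
            # Player 0: own option 0, or copy the partner's last choice x1.
            for y0, p0 in ((0, a0), (x1, 1.0 - a0)):
                # Player 1: own option 1, or copy the partner's last choice x0.
                for y1, p1 in ((1, a1), (x0, 1.0 - a1)):
                    q = p * p0 * p1
                    if q > 0.0:
                        new_dist[(y0, y1)] = new_dist.get((y0, y1), 0.0) + q
        dist = new_dist
        for (x0, x1), p in dist.items():
            if x0 == x1:  # the players met at the same option
                total0 += p * (1.0 if x0 == 0 else 1.0 - k)
                total1 += p * (1.0 if x0 == 1 else 1.0 - k)
    return total0 / rounds, total1 / rounds


if __name__ == "__main__":
    for pair in [(1.0, 1.0), (0.0, 0.0), (0.5, 0.5), (0.9, 0.1), (0.9, 0.9)]:
        print(pair, average_payoffs(*pair))
```

Under these assumptions, two pure leaders and two pure followers never meet, whereas a leader teamed with a follower coordinates on the leader's option in most rounds, which is the asymmetry the argument above relies on.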

These results¹ are interesting for several
reasons. First, they provide an explanation 
for empirical observations in the lab and field. 
For example, experiments with sticklebacks* 
have revealed pronounced individual differ- 
ences in the tendency to lead that resemble 
those in the model. Second, the results show 
that leadership and ‘followership’ can evolve in
the absence of any other differences between 
individuals. In the behavioural sciences, there 
is much discussion about which traits make 
someone a leader*. According to Johnstone 
and Manica’s model, leadership need not be 
associated with being better informed, being 
more dominant or having superior commu- 
nication skills. Instead, leadership may simply 
reflect an intrinsic tendency to follow one’s 
own preferences and disregard the choices of
others (Fig. 1). 

The third interesting aspect is that the paper 
provides a link to the issue of animal ‘person- 
alities’*, the phenomenon that animals differ 
systematically in their behaviour in a manner 
that is individually stable across a variety of 
contexts. In nature, leadership seems to be a 
personality trait that is correlated with general 
activity level, aggressiveness and boldness’. 
Johnstone and Manica provide a neat expla- 
nation for the emergence of individual differ- 
ences in leadership, but it is an open question 
how such correlations between leadership and 
other personality traits have evolved. 

The type of model presented by Johnstone 
and Manica sacrifices realism for conceptual 
clarity and analytical tractability. It remains to 
be seen whether the results are robust when 
more-realistic assumptions are incorporated 
or more-complex strategies are considered. 
For good reason, the authors have assumed 
that the players do not differ in features such 
as knowledge and power. In more realistic set- 
tings, asymmetries between the players will 
undoubtedly occur; such asymmetries can 
help to solve a coordination problem’. 

Moreover, even in symmetrical settings, 
differences in leadership will not necessarily 
evolve if more-complex strategies are avail- 
able. An example can be taken directly from 
the authors’ experimental work’: sticklebacks 
that have diverging preferences take turns in 
leadership, rather than specializing in the roles 
of leader and follower. Perhaps most impor- 
tantly, a group of individuals engaging in 
prolonged interactions can be expected to learn 
each other’s characteristics (for example, their
degree of leadership). It would be worthwhile 
investigating how the evolutionary outcome 
would change if individuals could signal their 
leadership tendencies — as humans clearly do. 

This work¹ may be criticized for its restricted
view of leadership. One could argue that the 
‘leaders’ in the model do not really lead, but 
simply refrain from following others. Lead- 
ers are defined as being stubborn, refusing to 
react to their fellow group members. Accord- 
ingly, the evolution of differences in leadership 
in this model bears some resemblance to the 
evolution of individual variation in responsive- 
ness* and social sensitivity’ seen in other mod- 
els. In reality, there are more dimensions to 
leadership, and it is not obvious that stubborn- 
ness and antisocial behaviour are characteris- 
tic features of leaders. In African elephants, for 
example, the most responsive and socially sen- 
sitive individuals have the highest propensity 
to become leaders of the herd”. 

Johnstone and Manica’s concept of lead- 
ership seems to be most easily applicable to 
fish shoals and other anonymous societies. 
Still, even for highly structured social sys- 
tems such as those of humans and elephants, 
their insight provides clues to how intrinsic 
differences in leadership could evolve as a
fundamental means to resolve the tension
between individual interests and the desire to
live in a group.
Stop the nonsense


A subtle biochemical alteration can reprogram signals that herald the termination 
of protein translation into signals encoding amino acids at the level of messenger 
RNA — and without altering the corresponding DNA. SEE LETTER P.395 


ADRIAN R. FERRÉ-D’AMARÉ


The amino-acid sequence of a protein is
specified by combinations of 64 trinucleotides
(or codons) in the corresponding
messenger RNA. Of these, three codons,
known as termination or nonsense codons, 
signal the end of protein translation. Some- 
times, however, rather than stopping protein 
synthesis, the translation machinery decodes a 
termination codon as an amino acid in what is 
known as nonsense suppression. On page 395 
of this issue, Karijolich and Yu¹ report an artificial
way of inducing nonsense suppression
— through post-transcriptional conversion of 
the uridine residue in termination codons into 
its isomer, pseudouridine. This finding raises 
fundamental questions about the biochemistry 
of protein synthesis and has implications for 
treating genetic diseases. 

Translation takes place in cellular orga- 
nelles called ribosomes, in which each mRNA 
codon is matched with the anticodon of an 
aminoacyl-tRNA. The latter is a transfer RNA 
that has been loaded by its cognate aminoacyl- 
tRNA-synthetase enzyme with the amino acid 
corresponding to its anticodon. None of the 
tRNAs has anticodons complementary to the 
termination codons; normally, proteins called 
release factors (RF1 and RF2 in bacteria, eRF1 
in eukaryotes) recognize the nonsense codons. 
But if a tRNA undergoes a mutation in its anticodon
such that it becomes complementary
to a termination codon (and if this mutant 
tRNA is otherwise recognized normally by its 
aminoacyl-tRNA synthetase and the rest of the 
translation machinery), it might lead to misin- 
terpretation of the termination codon. 

Indeed, such nonsense suppression by 
mutated tRNAs is well documented’. The 
findings of Karijolich and Yu¹ are surprising,
however, because of their significance for
the mechanism by which release factors are 
thought to recognize termination codons, and 
because of the structural similarity between 
pseudouridine (Ψ) and uridine (U).

The crystal structures of the bacterial ribo- 
some with its release factors caught in the act 
of recognizing termination codons” indi- 
cate how RF1 and RF2 recognize the U of 
all three termination codons (UAA, UAG or 
UGA): chemical groups in the backbone of 
these release factors seem to form hydrogen 
bonds with groups on the face of U that nor- 
mally participate in hydrogen bonding with 
another nucleotide — the Watson-Crick face. 

Figure 1 | Uridine and pseudouridine. Uridine
(U), the first residue of the three termination
codons, can be converted into its isomer
pseudouridine (Ψ) in a reaction catalysed by
pseudouridine synthase enzymes. Karijolich and
Yu¹ show that conversion of U to Ψ can transform
a termination codon into an amino-acid-coding
signal. The Watson–Crick faces of U and Ψ are
identical, but they differ in other details: Ψ, for
instance, has an imine group (NH) that projects
into the major groove of the RNA. Thick lines
denote the glycosidic bond that joins the bases to
the RNA backbone (R).

Although Ψ and U differ in that the former
has a carbon–carbon, rather than a
carbon–nitrogen, glycosidic bond and an imine (NH)
group that projects into the major groove of
the RNA, the Watson–Crick faces of these two
residues are identical (Fig. 1). Thus, release
factors should be insensitive to conversion
of the termination codons to ΨAA, ΨAG
or ΨGA.

What, then, is the property of Ψ, other
than its ability to form Watson–Crick base pairs,
that gives rise to nonsense suppression? Ψ
binds water through its major-groove imine
group, and this hydration makes Ψ-containing
RNAs stiffer®. It could be that, when the 
release factors bind the Ψ-containing mRNA,
the increased energy needed to dehydrate 
this modified mRNA results in nonsense sup- 
pression. Alternatively, nonsense suppression 
could be a consequence of the greater difficulty 
in unstacking the isomer-containing termina- 
tion codon from the previous codon as the 
isomerized codon is brought into the ‘reading’ 
position on the ribosome. Regardless of 
the physico-chemical basis, however, the 
new results point to a crucial role for factors 
other than Watson-Crick base pairing in the 
recognition of termination codons. 

Karijolich and Yu demonstrate nonsense 
suppression through pseudouridylation of 
termination codons both in vitro and in yeast. 
When the authors characterized the proteins 
synthesized following nonsense suppression, 
they uncovered another surprise. Rather than 
incorporating a random amino acid at the 
site occupied by the isomerized termination 
codon, the translation machinery specifically 
incorporates either serine or threonine at ΨAA
and ΨAG, and either tyrosine or phenylalanine
at ΨGA.
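
The recoding described above can be sketched as a toy translation routine. This is an illustration only, not a model of ribosome biochemistry: the standard genetic code is built from its conventional compact listing, pseudouridylated stop codons map to the ambiguous residue pairs reported in the text, and the names translate, RECODED_STOPS and PSI are hypothetical. The sketch further assumes that Ψ outside a stop codon is read like U and that the input starts in frame.

```python
BASES = "UCAG"
# Standard genetic code, listed in UCAG order for each codon position.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + m]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for m, b3 in enumerate(BASES)
}

PSI = "\u03a8"  # Greek capital psi, standing for pseudouridine

# Residues reported for pseudouridylated stop codons; "S/T" means
# serine or threonine, "Y/F" means tyrosine or phenylalanine.
RECODED_STOPS = {PSI + "AA": "S/T", PSI + "AG": "S/T", PSI + "GA": "Y/F"}


def translate(mrna):
    """Translate an in-frame mRNA string into a list of residues."""
    residues = []
    for start in range(0, len(mrna) - 2, 3):
        codon = mrna[start:start + 3]
        if codon in RECODED_STOPS:
            residues.append(RECODED_STOPS[codon])
            continue
        # Assumption: elsewhere, pseudouridine is decoded like uridine.
        amino_acid = CODON_TABLE[codon.replace(PSI, "U")]
        if amino_acid == "*":  # unmodified stop codon ends translation
            break
        residues.append(amino_acid)
    return residues


if __name__ == "__main__":
    print(translate("AUGGCUUAAGGCUGA"))
    print(translate("AUGGCU" + PSI + "AAGGCUGA"))
```

In the second call, translation reads through the modified first stop codon and ends at the next unmodified one, so the product is longer than that of the unmodified message.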

This observation is noteworthy because, 
although the two sets of amino acids have 
chemical commonalities (threonine and ser- 
ine both have a hydroxyl group, and tyrosine 
and phenylalanine share a phenyl ring), the 
anticodons of tRNAs for the four amino acids 
do not show any obvious complementarity to 
the termination codons. Mechanistically, this 
implies that pseudouridylation of termination 
codons leads not only to a loss of recognition 
by release factors, but also to a gain of recogni- 
tion by specific aminoacyl-tRNAs. The fidelity 
of normal translation is enhanced through a 
proofreading process in which the accuracy 
of codon-anticodon pairing is communicated 
across the ribosome to the amino-acylated 
(acceptor) end of tRNA. Perhaps pseudo- 
uridylation of termination codons also affects 
this process. 

Site-specific enzymes called pseudouridine 
synthases produce Ψ from U residues of cellular
RNAs‘. Eukaryotes and archaea have
a versatile class of pseudouridine synthases 
called H/ACA ribonucleoproteins (RNPs)’. 
These complexes have their four core proteins 
in common, but each assembles using one of 
many different RNAs (containing evolutionarily
conserved sequence elements called H and
ACA). The RNA component is called a guide 
RNA because it has a stretch of nucleotides 
complementary to the sequences that flank 
the uridine of the substrate RNA targeted 
for pseudouridylation. This sequence com- 
plementarity is necessary and sufficient for 
directing the H/ACA RNP to pseudouridylate 
a cellular RNA in vivo.

Karijolich and Yu¹ used a custom-designed
H/ACA guide RNA to target termination 
codons for pseudouridylation in their yeast 
experiments. The authors point out that this 
would also be an attractive approach to treating 
genetic disorders that result from premature 
termination of translation. Indeed, more than 
a third of genetic disorders and many cancers 
are due to mutations that introduce premature 
termination codons*. So, rather than having 
to correct the mutation at the level of DNA, 
all that is required would be delivery of an 
H/ACA guide RNA to pseudouridylate the 
defective mRNA. 

More broadly, it is possible that nature is 
already using this kind of ‘gene therapy’ to 
increase the coding capacity of genomes.
Karijolich and Yu have found several candidate
mRNAs whose termination codons could
be subjected to pseudouridylation by previously
described H/ACA guide RNAs. Such
mRNAs would produce a shorter protein in
their unmodified state and a longer protein
(ending at a second, unmodified termination
codon) when the first termination codon is
pseudouridylated.


CONDENSED-MATTER PHYSICS


Microscopy of the
macroscopic 


The presence of magnetic moments in materials known as Kondo lattices can lead 
to an exotic transformation in their properties. The first successful endeavour 
into imaging such a transformation has now been made. SEE LETTER P.362 


PIERS COLEMAN 


The availability of electron microscopes
after the Second World War made it 
possible to image life at the subcellular 
level, prompting a revolution in cell biology. 
Today, a new type of microscope that images 
electrons at the quantum level — the scanning 
tunnelling microscope — is helping to drive a 
revolution in condensed-matter physics. But 
whereas biologists look on the microscope as 
a path to greater understanding of life at the 
microscopic molecular level, physicists are 
increasingly seeking to use it to elucidate the 
emergent, macroscopic properties of electrons, 
such as high-temperature superconductivity 
and magnetism. On page 362 of this issue, 
Ernst et al.¹ use the scanning tunnelling microscope
to image the profound transformation
that a metal structure known as a Kondo lattice 
undergoes as it dissolves a lattice of magnetic 
moments. 
A scanning tunnelling microscope uses the 
quantum-mechanical nature of electrons to
image the electronic organization in metals. 
Although electrons lack the energy to escape 
from a metal, when a sharp metal tip is brought 
within a few angstroms of another host metal, 
electrons can quantum mechanically tunnel 
through the vacuum between the two. The tun- 
nel current that this generates is determined by 
the number of available electron states in the 
host metal with energies up to a voltage applied 
between the two metals. So by measuring the 
current as a function of applied voltage, one 
obtains the detailed electron energy spectrum 
of the host metal at the tip location (Fig. 1). 
Modern scanning tunnelling microscopes 
allow one to scan the tip across the metal, 
gradually building up a spatial profile of the 
electron energy spectrum with subångström
resolution. Developing the control
and isolation technology necessary to hold
a tunnelling tip with subångström fidelity,
even as the temperature is varied, has taken
several decades, but today the
approach is poised to open up a whole generation
of electronic media to levels of scrutiny
that, until recently, could only be imagined. 
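
The statement that the current–voltage curve yields the energy spectrum can be illustrated numerically. The sketch below makes simplifying assumptions that the text does not spell out: zero temperature, a featureless tip, and a current equal (in arbitrary units) to the integral of the host's density of states up to the applied voltage. The Lorentzian dip in density_of_states is a made-up test spectrum, not data from the study, and all function names are hypothetical.

```python
def density_of_states(energy_mev, depth=0.6, width_mev=5.0):
    """Hypothetical spectrum: flat background with a dip centred on zero."""
    return 1.0 - depth / (1.0 + (energy_mev / width_mev) ** 2)


def tunnel_current(voltages_mv, spectrum):
    """Cumulative trapezoidal integral of the spectrum over the voltage grid.

    The current is set to zero at the first grid point, so it differs from
    the physical current by a constant, which does not affect dI/dV.
    """
    current = [0.0]
    for i in range(1, len(voltages_mv)):
        step = voltages_mv[i] - voltages_mv[i - 1]
        area = 0.5 * step * (spectrum(voltages_mv[i - 1]) + spectrum(voltages_mv[i]))
        current.append(current[-1] + area)
    return current


def differential_conductance(voltages_mv, current):
    """Central-difference estimate of dI/dV at the interior grid points."""
    return [
        (current[i + 1] - current[i - 1]) / (voltages_mv[i + 1] - voltages_mv[i - 1])
        for i in range(1, len(voltages_mv) - 1)
    ]


# Voltage sweep from -20 mV to +20 mV in 0.1 mV steps.
voltages = [-20.0 + 0.1 * n for n in range(401)]
current = tunnel_current(voltages, density_of_states)
recovered = differential_conductance(voltages, current)
```

Differentiating the simulated current returns the input spectrum, dip included, to within the discretization error of the grid; repeating such a sweep at each tip position is what builds up a spatial map of the spectrum.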

One of the long-standing interests of 
condensed-matter physicists concerns the 
interaction of magnetism with electrons. 
Magnetism is driven by unpaired electrons 
that have become localized in an atomic orbital 
to form tiny magnets, or magnetic moments. 
Materials with partially filled f- and d-orbitals 
are particularly prone to forming magnetic 
moments. The high-powered rare-earth neo- 
dymium magnets used inside the motor of a 
modern hybrid car, for example, rely on the 
magnetism associated with f-electrons. 

Sometimes, such magnetic moments can 
actually dissolve inside a metal, profoundly 
modifying the metal’s properties, driving up 
the effective mass of the mobile electrons by 
several orders of magnitude to form ‘heavy
electrons’. This situation can be likened to
the behaviour of ions in water, in which the 
electrically polar water molecules ‘screen’ the 
ions and dissolve them into the solution. In the 
course of this process, the ions become mobile, 
and change the conductivity and chemistry of 
the solution. Similarly, in a metal the magneti- 
cally polar electron fluid tends to screen local 
magnetic moments immersed in it. During 
this process, the spin originally on the mag- 
netic moment delocalizes as a charged heavy 
electron. 

But whereas the solvation of ions is ther- 
mally driven — improving at high tempera- 
tures — the screening of magnetic moments 
in a metal is a quantum process. For the 
magnetic moment to be screened, it needs 
to become quantum mechanically entangled 
with the magnetic moment of the surround- 
ing electrons. Entanglement requires that the 
quantum-mechanical phase difference of the 
electrons involved remains well defined, and 
this occurs only at low temperatures, in which 
the ‘dephasing’ effects of temperature are 
negligible. 

The screening of magnetic moments inside 
a metal is called the Kondo effect after its 
discoverer, Jun Kondo. Several years after 
Kondo’s pioneering work², Sebastian Doniach³
showed that the same physics can occur in 
materials containing a dense array of magnetic 
moments immersed in a metal, called Kondo 
lattice materials. In these materials, the disap- 
pearance of the magnetic moments into the 
electron fluid profoundly changes the bulk 
properties, lowering the resistivity of the metal, 
and sometimes producing superconductivity. 

In their study, Ernst et al.¹ describe the first
successful attempt to image the magnetic 
screening process in a Kondo lattice material 
using scanning tunnelling microscopy
(STM). The observation of the Kondo effect at 
single magnetic ions was one of the early suc- 
cesses of STM, but it is only in the past year 
that the first successful STM measurements 
were carried out*” on an f-electron metal, 
the uranium ruthenium compound URu₂Si₂.
However, URu₂Si₂ is complicated by the
multi-electron physics of the uranium
ion. Building on this recent success, Ernst
et al.¹ chose an ytterbium rhodium material,
YbRh₂Si₂, which has the same structure
as URu₂Si₂ but in which the local magnetic
moments form an unambiguous Kondo lattice.

Figure 1 | Electron tunnelling from tip to host. Application of a voltage (−V) to the tunnelling tip
causes a current of electrons (I) to tunnel through the vacuum into the host metal. The derivative of the
current (dI/dV) determines the energy spectrum g(V) of the metal as a function of voltage; (x, y) denotes
the coordinates of the point at which the current enters the metal. Ernst et al.¹ observe a crater-like dip in
the energy spectrum around each ytterbium site in the ytterbium rhodium material YbRh₂Si₂.

In their single-atom-resolution STM 
images, Ernst et al.¹ observe a crater-like dip in
the tunnelling current around each ytterbium 
site (Fig. 1). This feature is a classic interference 
signature of the Kondo effect. The interference 
causes the STM signal to die out at the location 
of the magnetic moments. And this is precisely 
what is seen: as the temperature is lowered and 
the magnetic moments become entangled with 
the system’s conduction electrons, a dip devel- 
ops in the material's electronic density of states. 
Moreover, at an applied voltage of 6 millivolts, 
the authors see a small feature that develops 
below 30 kelvin, which they identify as the res- 
onant state associated with the Kondo effect. 
Unlike previous work on single magnetic 
moments, here the authors see these signatures 
in every unit cell of the lattice on the surface of 
the material, establishing the dense nature of 
the magnetic screening. 

A second important feature of the new 
results¹ is the observation of a magnetic effect
known as crystal-field excitations in an STM 
measurement. In a crystalline environment,
the energy levels of a magnetic moment are 
split into several crystal-field excitations. 
Crystal-field excitations are neutral and don't 
directly couple to electron currents, yet Ernst 
and colleagues observe them directly in the 
electronic density of states — a definite indica- 
tion that, through screening, the magnetic ions 
have become coupled to the mobile electrons. 
Moreover, the crystal-field excitations are in
precisely the right position as measured by 
neutron-scattering experiments. The authors 
are able to confirm this interpretation through 
detailed calculations on the Kondo effect in a 
crystalline lattice environment. 

Perhaps the most exciting aspect of these 
measurements¹ is the prospect they hold for
the future. The current measurements have 
all been made at above 4.5 kelvin, but at lower 
temperatures the quantum fluctuations associ- 
ated with the Kondo lattice become so intense 
that the masses of the heavy electrons diverge, 
leading to a ‘quantum critical point’ at which 
the heavy electrons are thought to break up 
into magnetic spins®. At a critical point, the 
fluctuations expand to macroscopic dimen- 
sions — something that can be seen in the 
opalescence of superheated water at the criti- 
cal point. The analogous quantum criticality, 
involving quantum fluctuations of macro- 
scopic dimensions, has never been observed 
directly. The excitement of Ernst and col- 
leagues’ results lies in the distinct possibility 
that such macroscopic criticality in the quan- 
tum waves of the Kondo lattice might soon be 
directly imaged by STM.
STRUCTURAL BIOLOGY


Porthole to catalysis 


The crystal structure of a sugar-transferring enzyme offers insight into the 
mechanism of a ubiquitous protein-modification reaction, and solves the mystery
of how the enzyme recognizes certain sequences in proteins. SEE ARTICLE P.350 


REID GILMORE 


One of the most common protein-modification
reactions in the cells of
eukaryotes (organisms that include
plants, animals and fungi) is N-linked 
glycosylation, in which sugars are attached to 
the side chain of the amino acid asparagine. 
This reaction has diverse roles in protein fold- 
ing and stability, intracellular trafficking and 
cell-cell interactions, and is catalysed by the 
enzyme oligosaccharyltransferase (OST). 
More specifically, OST mediates the transfer 
of an oligosaccharide from a donor substrate 
onto acceptor asparagine residues in newly 
synthesized proteins. In eukaryotes, such 
glycosylation occurs only at asparagine residues
located within asparagine-X-threonine/
serine amino-acid sequences (N-X-T/S 
sequences), where X can be any amino acid 
except proline. It has been unclear how OST 
recognizes the acceptor sites and the associated 
amino-acid sequences, and how it activates the 
normally unreactive nitrogen in asparagine’s 
side chain to take part in glycosylation. But on 
page 350 of this issue, Lizak et al.¹ now provide
remarkable insight into these issues with their
report of the X-ray crystal structure of PglB, an
OST from the bacterium Campylobacter lari. 
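
The N-X-T/S rule stated above is simple enough to express as a pattern search over a one-letter protein sequence. The sketch below is a hypothetical helper, not part of the study: it reports every asparagine followed by any residue other than proline and then by serine or threonine. It says nothing about whether a given site is actually glycosylated, which also depends on the site being accessible to the enzyme.

```python
import re

# N, then any residue except proline, then S or T. The lookahead lets
# overlapping candidate sites (for example in "NNST") be reported separately.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(protein):
    """Return (1-based position of the asparagine, matched tripeptide) pairs."""
    return [(match.start() + 1, match.group(1)) for match in SEQUON.finditer(protein)]


if __name__ == "__main__":
    # Made-up sequence for illustration only.
    print(find_sequons("MNGTANPSNNSTQ"))
```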

Figure 1 | Oligosaccharide transfer reactions. In eukaryotic cells, proteins made by the ribosome
are passed across the endoplasmic reticulum (ER) membrane through a protein-translocation
channel. Oligosaccharyltransferase (OST) proteins adjacent to the channel catalyse the attachment of
oligosaccharides to the proteins in N-glycosylation reactions before the proteins acquire secondary or
tertiary structure. For simplicity, only the active-site subunit of a eukaryotic OST (known as STT3) is
depicted. The oligosaccharides are assembled on lipid carriers (dolichol pyrophosphate molecules; P is
a phosphate group) in the membrane, and are attached to asparagine (N) side chains within N-X-T/S
acceptor sequences, where T is threonine, S is serine and X is any amino acid other than proline. Lizak and
colleagues’ crystal structure¹ of PglB (a bacterial protein analogous to STT3 in eukaryotes) in complex
with an acceptor peptide suggests how OST active-site subunits scan nascent polypeptides for N-X-T/S
sequences while positioning the lipid-linked oligosaccharide in the catalytic site.

Although often described as post-translational
modifications, most N-linked oligosaccharides
in eukaryotic cells are added
co-translationally to a nascent polypeptide as it
is threaded through the protein-translocation 
channel in the endoplasmic reticulum (ER) 
membrane’ (Fig. 1). The preassembled oligo- 
saccharide is anchored by a lipid to the luminal 
surface of the ER. Consequently, the OST has 
to scan nascent polypeptides for acceptor sites 
that are moving past the enzyme at the rate 
of protein synthesis (about 6-8 residues per 
second for eukaryotes), while simultaneously 
positioning the lipid-linked oligosaccharide 
(LLO) in the enzyme’s active site.

Unlike the oligomeric OST complexes that 
are present in eukaryotes’, PglB is mono- 
meric, making it a more suitable candidate for 
protein crystallography. PglB consists of an 
amino-terminal domain that binds to the inner 
bacterial membrane, and a carboxy-terminal 
domain that resides in the periplasm (the space 
between the inner and outer bacterial membranes).
Proteins are glycosylated by PglB as
they thread through the plasma membrane. 
The enzyme’s periplasmic domain includes 
a short, evolutionarily conserved amino-acid 
sequence (tryptophan-tryptophan-aspartic 
acid, abbreviated as WWD) that is essential in 
all OST catalytic subunits*”. The structure of 
this domain in PglB from the archaeon Pyrococcus
furiosus has been solved previously’, but
the structure of the membrane-bound domain 
was unknown, not least because crystallizing 
integral membrane proteins is always technically
challenging. Lizak et al.¹ surmounted
this challenge by crystallizing the complete
Campylobacter PglB protein in the presence
of its peptide substrate. 

The authors’ structure reveals that two large 
loops (EL1 and EL5) of the N-terminal domain
extend into the periplasm to form a platform 
that supports the periplasmic domain, and 
contain residues required for peptide binding 
and catalysis. Remarkably, the peptide-binding 
cleft is formed by the interface between the 
membrane-binding and periplasmic domains, 
and is located on the opposite face of PglB from 
the cleft that harbours the catalytic site and the 
LLO-binding site. This architecture explains 
how the large LLO and peptide substrates can 
independently enter the enzyme's active site. 
The structure is consistent with biochemical 
studies’ that indicated that acceptor sequences 
must be located within flexible or unfolded 
segments of polypeptides. It also shows that 
the critical WWD motif binds to the threonine 
side chain of the peptide substrate, thereby 
explaining why serine or threonine must be 
located two amino acids after asparagine in 
the N-X-T/S acceptor sequence. 

Another intriguing feature of the structure¹
is that the asparagine side chain in the peptide
projects through a narrow ‘porthole’ of PglB
that connects the peptide-binding cleft to the 
catalytic site. This site is formed by a cluster of 
charged residues that bind a magnesium ion 
(Mg²⁺). Although OST activity has long been
known to be dependent on a divalent cation
(either Mg²⁺ or the manganese ion, Mn²⁺), the
catalytic-site residues had not been identified. 
Lizak et al. report that three of the active-site 
residues have acidic side chains (they are 
aspartic acid or glutamic acid residues). When 
the authors replaced any of these residues with 
alanine, which has a non-acidic, methyl side 
chain, the glycosylation activity of the resulting 
enzymes was reduced by 50-90% compared 
with the wild-type enzyme. 

The side chain of asparagine contains an 
amide group (CONH₂) in which the nitrogen
atom is a ‘weak nucleophile’, which
means that it shouldn't react readily with the 
oligosaccharide of the LLO. So how does the 
enzyme activate the amide so that it can react? 
Previous models for OST catalysis envisaged 
that activation occurred through the formation 
of hydrogen bonds between the asparagine 
side chain and the threonine (or serine) resi- 
due in the peptide substrate*”. But Lizak and 
colleagues’ finding¹ that the asparagine residue
projects through the porthole in the active site 
rules out this possibility. 

Instead, the authors propose that two 
active-site residues form hydrogen bonds 
to the amide hydrogens of asparagine’s side 
chain, and in so doing increase the ability of 
the amide to react with the oligosaccharide. 
When the researchers replaced these catalytic 
residues with others to alter the hydrogen 
bonding to the asparagine side chain, the 
resulting mutants lost their catalytic activity, 
thus providing strong support for the proposed 
scheme. This activation mechanism could also 
explain why a glutamine-containing pseudo-acceptor
sequence (Q-X-T/S, where Q is
glutamine) can bind to the OST active site, yet 
not be glycosylated — the glutamine side chain 
cannot form properly positioned hydrogen 
bonds to the active site.

Following glycosylation, the asparagine 
side chain on the polypeptide substrate is 
covalently linked to a bulky oligosaccharide 
that cannot pass through the narrow active-site
porthole. So how can the reaction product
leave the active site? Lizak and colleagues’
structure reveals that the porthole is formed
by the packing of the flexible, partially dis- 
ordered EL5 loop against the periplasmic 
domain. They therefore propose that product 
formation induces a conformational change 
that promotes disengagement of EL5 from the 
periplasmic domain, followed by release of the 
glycosylated product and the lipid anchor. 

Lizak and colleagues’ crystal structure of 
PglB in complex with a peptide substrate 
provides invaluable information about PglB- 
peptide binding and the enzyme’s catalytic 
mechanism, but higher-resolution structures 
are needed to provide detailed insight into the 
mechanism. What’s more, the location of the 
LLO binding site in PglB cannot be precisely 
known until the structure of a PglB-LLO complex
is solved. Such a structure should also
reveal why N-acetylglucosamine (a sugar that
contains an NHCOCH₃ group in place of one
OH group) is present at the reducing end of all
eukaryotic LLO donors (that is, at the end
that connects the oligosaccharide to the lipid).
Finally, the structure of an enzyme-product
complex would provide insight into how the
product dissociates from the enzyme.

Early black holes uncovered


Collective X-ray emission from distant galaxies reveals a hidden population of 
supermassive black holes. This finding suggests that galaxies and their central 
black holes have been coevolving since early cosmic times. SEE LETTER P.356 


ALEXEY VIKHLININ 


The formation of the first stars and black holes a few hundred million years after
the Big Bang marks the end of the cosmic ‘dark ages’. These objects heat and ionize
the surrounding gas, and by a billion years after the Big Bang (or redshift (z) of about 6)
they have ionized nearly all of the hydrogen in the Universe. This grand picture is mostly
a product of theoretical modelling; data on the first generations of objects are scarce.
Only recently have candidate galaxies at z higher than 6 been identified in deep Hubble
Space Telescope images. What about the supermassive black holes that presumably lie at
their centres? A small number of quasars (extremely luminous galactic nuclei powered by
supermassive black holes) at z of about 6 have been discovered in the Sloan Digital Sky
Survey. But these objects are so luminous and massive that they should represent only the
very tip of the iceberg of the overall black-hole population. On page 356 of this issue,
Treister et al. present a detection of signal from much more ‘typical’ black holes, at z
about 6, found quite literally by sorting through the X-ray photons in the deepest images
taken by the Chandra X-ray Observatory.

Figure 1 | Teaming up Chandra with Hubble. By comparing and cross-correlating a Chandra X-ray
Observatory image of a patch of the sky known as Deep Field South (a) with observations of the same
area taken by the Hubble Space Telescope (b), Treister and colleagues discovered a population of
supermassive black holes at early cosmic times. (Image credits: a, NASA/JHU/AUI/R. Giacconi et al.;
b, NASA, ESA, G. Illingworth & R. Bouwens (Univ. California, Santa Cruz), the HUDF09 Team.)

© 2011 Macmillan Publishers Limited. All rights reserved
16 JUNE 2011 | VOL 474 | NATURE | 293
RESEARCH | NEWS & VIEWS

X-rays are the ubiquitous observational 
signature of gas falling into black holes. X-ray 
emission more easily penetrates dense gas 
and dust clouds than does optical or ultra- 
violet (UV) light, and so it is considered to be 
one of the cleanest methods of finding active 
black holes. Chandra’s image of the ‘Deep Field 
South’ patch (Fig. 1) indeed contains hun- 
dreds of active black holes, mostly in quasars 
at z = 1-3 (ref. 6). However, no black holes at 
z greater than 6 have hitherto been found in 
X-ray observations. The Chandra images are 
too small to contain rare ‘monsters’ such as the 
Sloan quasars, and are not sensitive enough to 
detect more typical supermassive black holes at 
these high redshifts. However, we know from 
Hubble observations of the same field that, in 
the area between detected Chandra sources, 
there are hundreds of galaxies at z higher 
than 6. 

In their study, Treister et al. cross-correlated
the Chandra X-ray photons with the locations 
of the Hubble galaxies and found a positive 
signal. The observed X-ray signal is very low 
— less than five X-ray photons per galaxy — 
and is detectable only because it was averaged 
over almost 200 high-z galaxies, yielding an 
effective exposure time of 23 years. But the 
statistical significance of this detection is high 
(nearly 7 sigma, to use the technical jargon), so 
we have a confident detection of X-ray emis- 
sion from a population of typical supermassive 
black holes hosted by high-z galaxies. 
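As a rough consistency check on the stacking figures quoted above, the effective exposure can be divided by the number of stacked galaxies to estimate the exposure behind each one. The sketch below assumes exactly 200 galaxies (the text says "almost 200") and a Julian year of 365.25 days; the function name and these round inputs are illustrative assumptions, not values taken from the study itself.

```python
# Rough arithmetic check on the stacked exposure quoted in the text.
# Assumptions (not from the study): exactly 200 galaxies, Julian year.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def per_galaxy_exposure_seconds(total_years: float, n_galaxies: int) -> float:
    """Split an effective stacked exposure evenly across the stacked galaxies."""
    return total_years * SECONDS_PER_YEAR / n_galaxies


exposure_s = per_galaxy_exposure_seconds(23.0, 200)
print(f"{exposure_s / 1e6:.1f} Ms per galaxy")  # prints "3.6 Ms per galaxy"
```

Under these assumptions each galaxy contributes a few million seconds of exposure, which is why fewer than five photons per galaxy can still add up to a significant stacked signal.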

A far-reaching finding of this analysis is the
X-ray ‘colours’ of these high-z galaxies. The 
observed X-ray emission spectrum peaks at 
energies greater than 3 kiloelectronvolts (keV). 
Because the Universe's expansion stretches an 
object's light waves by a factor of 1 + z, this cor- 
responds to a spectral peak at energies greater 
than 20 keV in the galaxies’ reference frame. 
This observation strongly indicates that the 
central black holes in nearly all high-z galax- 
ies within the Chandra field are blanketed in 
large amounts of cold gas that absorbs softer 
(lower-energy) X-rays. This same gas would 
completely absorb the optical and UV emis- 
sion, making these black holes detectable only 
in the X-ray band. 
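The rest-frame energy quoted above follows from a one-line calculation. As a worked check, taking z ≈ 6 (the approximate redshift given for these galaxies; treating it as a single exact value is an assumption made here for illustration) and the observed threshold of 3 keV:

```latex
E_{\mathrm{rest}} = (1+z)\,E_{\mathrm{obs}} \approx (1+6)\times 3\ \mathrm{keV} = 21\ \mathrm{keV}
```

An observed peak above 3 keV therefore corresponds to a rest-frame peak above roughly 20 keV.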

The authors estimate the total mass of the 
newly discovered black-hole population using 
the notion that about 10% (within a factor of
a few) of rest energy of infalling matter is radi- 
ated away, and by further assuming that the 
black holes have been accreting at the same 
rate for a substantial fraction of the age of the 
Universe at their redshift. A limiting factor 
in such an estimate is that only a small frac- 
tion of the black-hole luminosity is in the 
hard (high-energy) X-ray regime directly 
observed by Chandra. Another caveat is that 
more than 95% of the black holes’ luminosity, 
emitted mostly in the UV band, never reaches 
outside the galaxies because of absorption.

The reader might therefore conclude that the 
authors’ estimate of the total mass of the high-z 
black-hole population, which is based solely on 
Chandra’s X-ray emission, is highly uncertain, 
even though such a high uncertainty is usually 
unavoidable for ground-breaking astronomi- 
cal observations. With these reservations in 
mind, it is still intriguing that Treister et al. 
conclude that the estimated black-hole masses 
are consistent with a hypothesis in which the 
relationship between galaxy mass and black-hole
mass that is observed in the local Universe
is already established a billion years
after the Big Bang. 

Treister and colleagues’ results have implications
for many studies of the early Universe.
Unfortunately, however, answers to some key 
questions — such as how the progenitors of 
these early supermassive black holes were gen- 
erated, or the exact mechanism that underlies 
the coevolution of the black holes and their 
host galaxies — will probably have to wait 
for the next generation of telescopes. These
telescopes should be capable of detecting the
objects individually in the X-ray band and
of observing their absorbed and re-radiated
emission in the sub-millimetre and far-infrared
regimes. But Treister et al. have taken the
first, most difficult, step: the detection of a
typical population of supermassive black holes
near the end of the cosmic dark ages.

GENE EXPRESSION

The autism disconnect


Separating primary from secondary changes in the autistic brain has long been a 
research goal. With knowledge of wide-ranging molecular deficits, identification 
of the best therapeutic targets becomes a priority. SEE LETTER P.380 


ŽELJKA KORADE & KAROLY MIRNICS


Autism spectrum disorder is a complex
neurodevelopmental condition. It is
characterized by altered social interactions,
communication difficulties and repetitive
patterns of behaviour. There is no known
single cause of autism, but it is believed that
genetic predisposition together with environmental
influences lead to molecular changes
in brain cells, altering normal brain development.
On page 380 of this issue, Voineagu
et al. present the first appropriately powered
and comprehensive gene-expression analysis of 
autistic brains using cutting-edge technologies 
and excellent data-mining approaches. 

The authors measured messenger RNA 
levels for more than 30,000 genes in three 
regions of post-mortem brains from 19 patients 
with autism and 17 controls. They identified 
444 genes that were differentially expressed 
between the cerebral cortices of the autis- 
tic and control brains. They then replicated 
most of the data with a second, independent 
cohort of post-mortem brains. 

Brain regions differ in cellular composition, 
connectivity and molecular signatures, which, 
together, lead to functional specialization. In 
humans, for example, the prefrontal cortex
is primarily responsible for higher cognitive 
processes such as working memory, whereas 
the temporal cortex is crucial for auditory per- 
ception and semantics. Voineagu et al. report 
that the differential patterns of gene expres- 
sion that normally distinguish the frontal and 
temporal cortices are significantly attenuated 
in the autistic brain. This disappearance of 
differential gene expression — which may also 
occur in other regions of the autistic brain — 
is very intriguing. Loss of cortical patterning 
may impair connectivity between the brain 
regions and, ultimately, weaken the specialized 
functions of the cortical areas. 

To investigate the functional relationship 
between gene-expression changes, the authors
used two distinct data-mining methods. One
method groups genes together on the basis of
their known cellular, molecular or functional
characteristics; the other builds functional
gene modules according to the observed
co-expression relationships. The authors
identified two distinct gene-expression mod- 
ules associated with autism that might act in 
concert to disrupt typical brain development. 
One of the modules, which mediates synaptic
communication between neurons, showed
decreased expression in the autistic
cortices. The other was related to
immune-system activation in the
brain and showed increased expression.
These observations are consistent
with previous findings that
the brains of patients with autism
show altered expression of genes
relating to normal synapse development
and increased expression
of genes mediating inflammation.
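The second of the two data-mining approaches described above, grouping genes by co-expression, can be illustrated with a toy sketch. The example below is not the method used in the study: it simply links genes whose expression profiles have a Pearson correlation above a threshold and reports the connected groups as modules. The gene names, expression values, the `coexpression_modules` helper and the 0.9 threshold are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coexpression_modules(expression, threshold=0.9):
    """Group genes into modules: connected sets of strongly correlated genes."""
    genes = list(expression)
    parent = {g: g for g in genes}  # union-find forest, one tree per module

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]  # path compression
            g = parent[g]
        return g

    for a, b in combinations(genes, 2):
        if pearson(expression[a], expression[b]) >= threshold:
            parent[find(a)] = find(b)  # merge the two modules

    grouped = {}
    for g in genes:
        grouped.setdefault(find(g), []).append(g)
    return sorted(sorted(members) for members in grouped.values())


# Hypothetical expression levels across four samples (illustrative only).
toy = {
    "synaptic_1": [5.0, 4.0, 3.0, 2.0],
    "synaptic_2": [10.0, 8.1, 6.0, 4.2],
    "immune_1": [1.0, 2.0, 3.0, 4.0],
    "immune_2": [2.0, 4.1, 6.0, 8.2],
}
modules = coexpression_modules(toy)
print(modules)
```

In this toy data the two "synaptic" genes fall together across the samples and the two "immune" genes rise together, so the sketch recovers one module for each pair.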

The central gene in the synaptic 
module was A2BP1, which func- 
tions in a pathway that has been 
implicated in autism. The A2BP1
protein controls what portions of 
a gene are included in the mature, 
functional mRNA during the pro- 
cess of alternative splicing. Through 
genome-wide assessment of autis- 
tic brains with decreased A2BP1 
expression, Voineagu et al. found 
A2BP1-dependent deficits in the 
RNA-splicing assembly of many 
genes. So it is possible for reduced 
expression of a single gene to affect 
a huge number of other genes that 
are responsible for normal brain 
development. Because diminished 
patterning of the cortex might arise 
from altered brain development, 
one could hypothesize that a gen- 
eral developmental dysregulation 
of transcript splicing leads to a more uniform 
development of various cortical areas, pre- 
venting proper functional specialization of 
these brain regions. This is an interesting and 
little-studied concept. 

Are the gene-expression differences reported 
here primary (genetic) or environmentally 
induced? In search of an answer, the researchers
analysed DNA data from a published
genome-wide association study of autism to
test whether expression changes in the genes 
of synaptic and immune modules were due to 
specific sequences in the patients with autism. 
Compared with controls, genes that were 
involved in synaptic function (and showed 
altered RNA expression in post-mortem sam- 
ples) were also strongly associated with a genetic 
predisposition to autism, suggesting that these 
differences are probably strong contributors 
to the development of this disorder. 

But perhaps the authors’ most intrigu- 
ing finding concerns the immune module. 
Immune dysfunction has been suggested to
occur in autism, although not by an unbiased 
genome-wide assessment. Voineagu et al. 
now show that — unlike expression changes 
in synapse-related genes, which seem to be 
due to genetic predisposition to autism — the 
immune response of the autistic brain is prob- 
ably a non-genetic, adaptive or environmental 
process. 

These data should, however, be interpreted
cautiously. And they do not diminish the
importance of immune-system activation in
autism as a therapeutic target. Classification
of primary and secondary changes in complex
brain disorders is somewhat artificial, and one
cannot be certain which set of changes mainly
contributes to the symptoms. But this is not
necessarily a problem. After all, some of the
most common therapies (those that treat
fever in influenza, cough in upper respiratory
infection or pain in arthritis) target disease
symptoms (and not causes), providing
relief for the patients. Besides, owing to high
genetic predisposition and heterogeneity
among patients, causal treatments for autism
might not be possible in the foreseeable future,
whereas therapies targeting the common,
symptom-related molecular pathways might
be within reach.

But how can we establish which gene-expression
alterations are most critical and
how they relate to the symptoms of autism?
Is a defect in alternative splicing more important
than disturbances in synaptogenesis or
immune-system induction? Can we even
assess them independently, or are the various
impairments intertwined? Unlike our notable
ability to uncover autism-related molecular/genetic
changes, our ability to make clinical
sense of the observed and validated changes
is limited. Linking molecular abnormalities
to disease symptoms and manifestations is
particularly challenging in the absence of true
animal models, as is the case for autism.

One of the strengths of Voineagu
and colleagues’ study is that it provides
a framework for a testable
model (Fig. 1). The main priority
is to determine the extent and origin
of the differential splicing in
the autistic brain, and its effect on
the development of various brain
regions. We must also establish
whether splicing dysregulation is
limited to A2BP1-targeted genes
alone or is more widespread. The
causality of the various changes
is another fascinating issue: do
the genetics-driven, converging
synaptic alterations activate a
detrimental immune response, or
does the immune response have
a more pronounced and potentiated
effect when the synaptic
genes show genetic vulnerability?
As the authors suggest, how the
changes they report relate to other
neurodevelopmental disorders
such as schizophrenia and attention
deficit hyperactivity disorder
should also be tested, given
the emerging evidence of genetic
overlap.

Figure 1 | The complexity of autism. In autism, genetic predisposition
and environmental influences disrupt typical gene-expression patterns;
this, in turn, alters brain development. Voineagu and colleagues
suggest that whereas alterations in messenger RNA splicing and synaptic
disturbances are primarily controlled by genetics, immune responses
in the autistic brain are either adaptive or environmental. Together,
these deficits cause impaired connectivity and, ultimately, altered
behaviour, cognition and emotion.

Finally, the molecular changes 
ought to be correlated with the 
core features of the disease. The 
available data suggest that autism- 
related genetic and molecular changes are 
not present in all patients. However, there 
has not been a combined molecular-behavioural
classification of autism spectrum disorder
into subgroups, such as ‘splicing’
autism, ‘synaptic’ autism and ‘inflamma- 
tory’ autism. Should such a molecular clas- 
sification become achievable (and linkable to 
specific symptoms of the disorder), it would 
revolutionize autism research, and open 
the door to developing more targeted and 
individualized therapies.

Our bowels have two major roles: the digestion
and absorption of nutrients, and the
maintenance of a barrier against the external
environment. They fulfil these functions in the context
of, and with help from, tens of trillions of resident
microbes, known as the gut microbiota.

This Insight has as its topic the various relationships 
that contribute to keeping this complex system in 
balance, and that help to protect us from a wide 
spectrum of diseases, including chronic inflammatory 
bowel disease, colorectal cancer and metabolic disease. 
It explores the interactions between the environment 
and host genetics, between the type and amount of 
food we eat and the composition of the microbial 
community, and between the microbiota, the intestinal 
epithelium and the immune system. It also highlights 
the regulatory mechanisms that control the rapid and 
continuous renewal of epithelial cells in the intestinal 
lining from resident stem cells. It discusses the cellular 
and molecular pathways that help to maintain intestinal 
homeostasis, and explores the mechanisms that cause 
pathology and disease when these pathways fail. 

We hope that these articles will contribute to a better 
understanding of the nature of these complex networks, 
and point to future strategies that will be successful 
in paving the way for more effective preventative and 
therapeutic measures, ultimately benefitting human 
health. 

We are pleased to acknowledge the financial support 
of Yakult Honsha in producing this Insight. As always,
Nature carries sole responsibility for all editorial content
and peer review. 


Ulla Weiss 
Insights Editor 

Intestinal homeostasis and its breakdown
in inflammatory bowel disease

Kevin J. Maloy & Fiona Powrie


Intestinal homeostasis depends on complex interactions between the microbiota, the intestinal epithelium and the host 
immune system. Diverse regulatory mechanisms cooperate to maintain intestinal homeostasis, and a breakdown in 
these pathways may precipitate the chronic inflammatory pathology found in inflammatory bowel disease. It is now 
evident that immune effector modules that drive intestinal inflammation are conserved across innate and adaptive 
leukocytes and can be controlled by host regulatory cells. Recent evidence suggests that several factors may tip the balance 
between homeostasis and intestinal inflammation, presenting future challenges for the development of new therapies for
inflammatory bowel disease.


Inflammatory bowel disease (IBD) refers to chronic inflammatory
disorders that affect the gastrointestinal tract. There are two main
clinical forms of IBD: Crohn's disease, which can affect any part
of the gastrointestinal tract, and ulcerative colitis, in which pathology
is restricted to the colonic mucosa. The precise aetiology of IBD
remains unclear, but several factors that make a major contribution to
disease pathogenesis have been identified. These fall into three distinct
categories: genetic factors, the host immune system, and environmental
factors such as the gut microbiota, which is dominated by intestinal
bacteria.

On a cellular level, the dynamic crosstalk between intestinal epithelial
cells (IECs), intestinal microbes and local immune cells represents one of
the fundamental features of intestinal homeostasis. These interactions
are not only important for the pathogenesis of IBD, but also essential for
maintaining normal intestinal homeostasis and for mounting protective
immunity to pathogens. In this Review, we summarize recent findings
from disease models and clinical samples that illustrate key interactions
and pathways that regulate intestinal homeostasis. We discuss how
defects in epithelial barrier function, innate immune recognition or
immune regulatory circuits may precipitate the aberrant expression of
pathological inflammatory responses in IBD. Finally, we offer some
perspectives on future challenges for developing therapies for IBD.


Regulation by the epithelial barrier 

The intestinal epithelium represents a huge surface area of 
approximately 100 m² that is lined by a single layer of columnar IECs,
which form a stout physical barrier. IECs, however, form much more 
than a simple physical barrier that processes and absorbs dietary 
nutrients. They perform several other functions that are crucial 
for intestinal homeostasis. These include secretion of compounds
that influence microbial colonization, sampling of the intestinal 
microenvironment, sensing of both beneficial and harmful microbes, 
and induction and modulation of immune responses. To fulfil such 
diverse functions, the intestinal epithelium has unique anatomical and 
cellular adaptations, and IECs comprise several specialized cell types 
with distinct functions. In addition, IECs do not regulate intestinal
homeostasis in a solely intrinsic fashion, but instead function as part of a
coordinated response to signals provided by the commensal microbiota
and from local leukocyte populations.


The mucosal surfaces of the body, including the intestine, are coated 
by a thick mucus gel comprising an outer layer of secreted mucins 
overlying a dense inner glycocalyx of membrane-anchored mucins 
that is inaccessible to most bacteria. In addition to providing a
biophysical barrier, mucus forms a matrix that allows the retention of 
high concentrations of antimicrobial molecules, such as defensins and 
secretory IgA, close to the epithelial surface. The mucus layer has a 
crucial role in intestinal homeostasis, as decreased levels of goblet cells, 
leading to reduced mucin secretion, are a hallmark of human IBD, and 
mice lacking the major mucin protein MUC2 develop spontaneous 
colitis.

IECs also regulate the colonization and penetration of the epithelium 
by luminal microbes through the secretion of antimicrobial peptides 
(AMPs), which include lysozymes, defensins, cathelicidins, lipocalins 
and C-type lectins such as RegIIIγ. Although some AMPs are produced
constitutively by many IECs, others are secreted in an inducible fashion 
by Paneth cells — a type of specialized IEC located at the base of the 
intestinal crypts of the small intestine. In mice, Paneth cells have been 
implicated in protection against intestinal pathogens, as well as in 
limiting colonization by commensal bacteria. Several lines of evidence
suggest that Paneth cell dysfunction and impaired defensin secretion 
may contribute to IBD susceptibility. Patients with ileal Crohn’s disease 
or those with NOD2 (also known as CARD15)-susceptibility alleles had
reduced α-defensin expression, and genetic variants in the transcription
factor TCF4, which is involved in Paneth cell maturation and function,
have recently been associated with ileal Crohn’s disease.


Autophagy and ER stress control epithelial homeostasis 

Paneth cell abnormalities have also been reported in patients
with Crohn's disease who are homozygous for the T300A disease-risk
allele of the autophagy gene ATG16L1, and in mice rendered
hypomorphic for the ATG16L1 protein. Autophagy is a fundamental
process that controls the catabolism of intracellular constituents
in response to stress or infection, characteristically involving the
formation of autophagosomes that target cargo for degradation by
the lysosomal machinery. Autophagy affects host immune responses
on several levels and has a vital role in cell-intrinsic defence against
intracellular infections. Defects in ATG16L1 led to the accumulation
of morphologically abnormal granules in Paneth cells, suggesting
that the secretory granule pathway is impaired, although ATG16L1-deficient
Paneth cells also expressed higher levels of pro-inflammatory
mediators.

Similarly, several recent studies have identified a link between
endoplasmic reticulum (ER) stress and the consequent unfolded
protein response (UPR) and IBD. The maintenance of functional
secretory cells requires coordinated protein folding and trafficking of
secretory proteins by the ER-Golgi network. The UPR is elicited by the
accumulation of unfolded or misfolded proteins in the ER; thus, highly
secretory cells, such as goblet cells and Paneth cells, are very susceptible
to ER stress, and a functional UPR is required to maintain epithelial
homeostasis in the gut. Mice containing a genetic deletion in the IEC-restricted
isoform of the UPR effector enzyme IRE1 showed increased
susceptibility to colitis induced by dextran sulphate sodium (DSS)
administration. Activation of IRE1 results in splicing and activation
of the transcription factor XBP1, and mice with a conditional Xbp1
deletion in IECs developed spontaneous enteritis that showed many
characteristic features of human IBD. Deletion of Xbp1 in mouse
IECs led to a loss of Paneth cells, a significant reduction in goblet cells
and hyper-responsiveness of IECs to pro-inflammatory signals.
Other genetic lesions that result in increased ER stress in the intestinal
epithelium also predispose individuals to intestinal inflammation.

The ER stress pathway is also relevant to human gastrointestinal 
diseases, as increased ER stress is observed in the intestinal epithelium 
of patients with IBD, and polymorphisms within ER stress response 
genes, including XBP1, AGR2 and ORMDL3, are associated with 
susceptibility to both Crohn’s disease and ulcerative colitis. The
degree of ER stress within the intestinal epithelium may be modulated 
by environmental factors derived from the host or from the intestinal 
microbiota. Pro-inflammatory conditions and cytokines such as tumour 
necrosis factor-α (TNF-α) exacerbate ER stress, whereas the
anti-inflammatory cytokine interleukin-10 (IL-10) reduces it. Furthermore,
there is evidence to suggest that the ER acts as a source for the 
autophagosome membrane and that the UPR can activate autophagy,
indicating that these processes cooperate during infection or stress 
of the intestinal epithelium. Taken together, these studies indicate 
that defects in specialized secretory IEC populations, or aberrant ER 
stress or autophagy responses in IECs, greatly predispose people to the 
development of intestinal inflammation. 

IECs may also influence intestinal homeostasis through the secretion
of conditioning cytokines that affect adaptive responses primed by
intestinal dendritic cells. In the healthy intestine, these conditioning
factors help to maintain a state of hyporesponsiveness towards
commensal bacteria. For example, cytokines constitutively expressed by
IECs, such as thymic stromal lymphopoietin and IL-25, limit dendritic-cell
production of the p40 subunit of IL-12 and IL-23 and promote IL-10
secretion, impeding the priming of T helper 1 (TH1)-cell responses and
instead favouring the induction of T regulatory (Treg)-cell and TH2-cell
responses. Conversely, after sensing pathogenic invasion or damage,
IECs can elaborate the secretion of pro-inflammatory chemokines,
such as IL-8 (also known as CXCL8), which have an important role
in alerting the immune system to microbial attack. IECs also exert
a strong influence on local antibody responses by producing factors
such as transforming growth factor-β (TGF-β), B-cell activating factor
(BAFF, also known as TNFSF13B) and a proliferation-inducing ligand
(APRIL, also known as TNFSF13), which promote class-switching of
B cells towards the production of IgA. IECs mediate the transport of
secretory IgA into the mucus layer, where it has a complementary role
to innate defences in limiting the penetration of commensal bacteria
across the epithelium.


Pattern recognition receptors and intestinal homeostasis 
Accumulating evidence indicates that microbial sensing through
pattern recognition receptors (PRRs) drives complementary functions
in IECs and haematopoietic cells, which together control intestinal
homeostasis. The context of PRR activation is crucial. In the
healthy intestine, basal PRR activation maintains barrier function
and commensal composition, but aberrant PRR signalling may be a
central contributor to the pathophysiology of IBD. The latter point is
emphasized by genetic-association studies linking PRR genes, including
NOD2, NLRP3 and various Toll-like receptor (TLR) genes, with IBD
susceptibility, although the mechanisms responsible remain unclear.


Tonic PRR signals maintain a healthy epithelium 

The importance of TLR signalling in regulating epithelial barrier 
function has been shown by studies using the DSS-colitis mouse model, 
in which DSS administration results in chemical destruction of the IEC 
layer and penetration of commensal bacteria, leading to acute colitis 
followed by restitution and repair of the epithelial barrier. Mice that 
lack specific TLRs, such as Tlr2, Tlr4, Tlr5 or Tlr9, or that are deficient
in the shared TLR signalling adaptor protein MyD88 show increased 
susceptibility to DSS colitis, characterized by defective tissue repair and/ 
or increased mortality. TLR signals drive intrinsic protective effects
in IECs by inducing several proliferative and anti-apoptotic factors and 
by promoting epithelial restitution and fortifying intercellular tight 
junctions. Intrinsic TLR signals in IECs also have a central role in
limiting bacterial colonization and translocation by stimulating IEC 
production of AMPs, such as defensins and RegIIIγ. An elegant
study in which MyD88 expression was selectively limited to Paneth 
cells showed that these cells sense commensal bacteria directly through 
TLRs, and that this sensing induced AMP production that limited 
bacterial translocation across the intestinal mucosa.

The activation of cytosolic NOD-like receptors (NLRs) may also 
be important in maintaining barrier function, as mice lacking Nod1 
or Nod2, or bearing a Crohn’s-disease-associated mutant allele of 
human NOD2, showed defects in defensin secretion and increased 
susceptibility to DSS colitis'”"*. Recent studies have indicated that NLR- 
mediated inflammasome activation also contributes to protection 
after damage of the epithelium, because mice deficient in NLRP3, 
its adaptor ASC or caspase-1 showed enhanced colitis and mortality 
after DSS administration’ *'. Inflammasomes are multimolecular 
complexes that activate caspase-1. They are formed after the activation 
of cytosolic NLRs, which then associate with caspase-1, often through 
interactions with adaptor proteins such as ASC”. Activated caspase-1 
has a central involvement in the processing and secretion of two key pro- 
inflammatory cytokines, IL-1 and IL-18, which in turn bind to receptors 
that use MyD88 as a signal-transduction adaptor”. The administration 
of exogenous IL-18 attenuated DSS colitis in Casp1−/− mice (which lack 
caspase-1)'””®, and Il18−/− mice and Il18r1−/− mice (which lack the IL-18 
receptor) showed increased susceptibility to DSS colitis, highlighting a 
role for IL-18 in epithelial barrier integrity and repair”. 


Sustained PRR signals drive intestinal inflammation 
In contrast to the protective barrier responses elicited by tonic PRR 
signals in IECs, studies in mouse models of chronic colitis have 
demonstrated that sustained PRR activation drives chronic intestinal 
inflammation. Thus, MyD88 signals were required for spontaneous 
colitis development in Il10−/− mice’*"*. A key role for MyD88 signals in 
haematopoietic cells was indicated by experiments in which selective 
ablation of MyD88 rendered mice refractory to intestinal inflammation 
induced by Helicobacter hepaticus’*. Although these results do not 
exclude a contribution from PRR signals in IECs to inflammatory 
responses in the gut, they indicate that such responses alone are 
insufficient to drive chronic inflammatory pathology. Similarly, recent 
studies have demonstrated that T-cell-intrinsic MyD88 signals are 
not essential for the expression of pathogenic effector or regulatory 
functions in the intestine’’, whereas MyD88 signals in dendritic cells 
during the sensing of intestinal microbiota were shown to be essential 
for T-cell proliferation and intestinal pathology”. 

Patients with IBD, particularly those with ulcerative colitis, have a 
greatly increased risk of developing colitis-associated cancer. Studies 
of the role of PRR signals in colitis-associated cancer, however, have 
produced conflicting results*”’. Although sustained PRR-driven 
inflammatory responses can exacerbate intestinal tumorigenesis, their 
role in epithelial barrier maintenance and repair is protective against the 
development of intestinal tumours”. It is also worth noting that a few 
studies have reported protective effects of PRR signals against chronic 
intestinal inflammation, again emphasizing that some PRR signalling 
can be beneficial in the gut. For example, bacterial polysaccharide A 
derived from the human commensal bacterium Bacteroides fragilis was 
able to protect mice from T-cell-mediated colitis in a TLR2-dependent 
manner through the induction of Treg cells*. 


Integration of bacterial handling and stress responses 

Microbes trigger a diverse range of PRRs and cellular stress responses 
that do not operate in isolation, but are integrated by the cell to direct 
appropriate effector responses. Thus, defects in one PRR pathway may 
influence other PRR signalling cascades, as well as affecting other 
processes implicated in intestinal homeostasis, such as autophagy and 
ER stress. 

For example, Crohn’s-disease-associated mutations in the NOD2 
gene are mainly located in the leucine-rich-repeat region that mediates 
sensing of the peptidoglycan motif muramyl dipeptide (MDP), resulting 
in reduced activation of nuclear factor-κB (NF-κB)’”. It has been 
proposed that NOD2 could act by attenuating TLR signalling that drives 
excessive activation of dendritic cells and TH1-cell responses”. Further 
evidence that NOD2 can regulate TLR signalling in the gut came from 
a mouse model of necrotizing enterocolitis, in which activation of 
NOD2 by MDP inhibited TLR4 signalling in IECs and ameliorated the 
condition”. 

There is an increasing appreciation that PRR signals intersect with 
other bacterial-handling and cellular-stress processes to coordinate 
protective and inflammatory responses. For example, TLR signals 
can induce autophagy, and this has been reported to enhance the 
clearance of microbes’. Recent studies have found that NOD2 stimulates 
autophagy by interacting directly with ATG16L1, which allows the 
recruitment of ATG16L1 to sites of bacterial entry’. Furthermore, 
dendritic cells expressing Crohn’s-disease-associated mutant forms of 
NOD2 or ATG16L1 showed reduced autophagy in response to MDP, 
and this led to impaired antigen presentation and bacterial killing”. 

The regulation of inflammasome activation by autophagic pathways 
has been reported, with selective ablation of ATG16L1 in mouse 
haematopoietic cells leading to increased inflammasome activation and 
IL-1 secretion in response to lipopolysaccharide*. These mice were also 
highly susceptible to DSS colitis, which was reversed by the neutralization 
of IL-1 and IL-18 (ref. 32). The mechanism involved is not clear, but 
autophagy has been reported to inhibit the generation of reactive oxygen 
species (ROS), especially by dysfunctional mitochondria, which have 
been shown to trigger the activation of NLRP3 inflammasomes™. 
Autophagy may also inhibit pyroptosis, a highly inflammatory form 
of caspase-1-dependent cell death that has been observed in myeloid 
cells infected with intracellular pathogens*. In addition, ROS generation 
has been shown to stimulate autophagy that restricted the replication of 
Salmonella enterica serovar Typhimurium (S. Typhimurium) in IECs, 
suggesting that ROS-induced autophagy may act as a negative-feedback 
mechanism to limit caspase-1-driven inflammatory circuits, while 
providing a complementary mechanism of bacterial defence”. Recent 
studies have also reported interactions between TLR and ER stress 
pathways, although both inhibition and activation of distinct arms of 
the UPR have been observed after TLR stimulation”. 

Taken together, these studies indicate that diverse PRR signals interact 
with the autophagy and ER stress pathways to coordinate bacterial 
handling and inflammatory responses, and suggest that deficiencies or 
perturbations of these networks could contribute to IBD pathogenesis 
(Fig. 1). A better understanding of how these networks function in IECs 
and leukocytes in the healthy and inflamed intestine may give rise to 
new therapeutic avenues for IBD, and could also reveal strategies for 
boosting mucosal barrier defences in immune-suppressed individuals. 

Figure 1 | Bacterial sensing and cellular stress pathways in intestinal 
homeostasis. a, Bacterial sensing and handling pathways cooperatively 
maintain the intestinal epithelial barrier. In IECs, basal sensing of microbial 
pathogen-associated molecular patterns by PRRs such as TLRs maintains 
epithelial barrier function by stimulating AMP expression (green arrows), and 
by fortifying tight junctions and inducing the release of protective cytokines 
such as IL-18 (red arrows). In addition, intrinsic PRR signals in IECs stimulate 
anti-apoptotic and proliferative responses (black arrows). The autophagy 
pathway, induced by PRR signals or by the ER stress response (blue arrows), 
cooperates with PRRs to promote the secretion of AMP and mucins (green 
arrows), and also constitutes an important cell-intrinsic defence mechanism 
in IECs for bacterial clearance. Therefore, defects in PRR sensing, the UPR 
or the autophagy pathways can result in impaired barrier function, leading to 
increased bacterial colonization and translocation and eliciting an exacerbated 
inflammatory response. b, Aberrant PRR signals in haematopoietic cells drive 
chronic inflammation in IBD. PRR signals in dendritic cells and macrophages 
drive chronic inflammatory responses in the gut through the activation of 
NF-κB-dependent pro-inflammatory cytokines (such as IL-23, IL-6 and TNF-α) 
and caspase-1 (CASP1)-mediated induction of IL-1, IL-18 and pyroptosis (red 
arrows). The magnitude of these inflammatory responses may be tempered by 
NOD2 suppression of TLR signals (green line) and stimulation of the autophagy 
pathway by NOD2, TLRs or ROS induction (blue arrows). Autophagy may 
attenuate ROS production and CASP1 activation (black lines), thus limiting 
inflammatory responses. Therefore, defects in NOD2 or autophagy pathways 
may contribute to the excessive inflammatory responses observed in IBD. 

Pathological effector modules in the gut 

There are a multitude of animal models of IBD that either arise 
spontaneously or are induced by various experimental manipulations, 
which reproduce distinct features of human IBD. However, there is 
no perfect experimental model, because patients with IBD present 
a heterogeneous spectrum of pathological features that reflect the 
participation of a diverse range of innate and adaptive immune effectors. 
This heterogeneity is further underscored by the recent observations 
that around 100 distinct genetic loci may contribute to IBD susceptibility 
(see page 307), and the key target of these aberrant immune responses, 
the gut microbiota, is unique to each individual’. It is therefore likely 
that there will be several aetiologies of human IBD, and these may reflect 
aberrant expression of distinct immune modules. In this context, an 
immune module is used to define an effector response coordinated by 
a group of cytokines that may be produced by innate and/or adaptive 
leukocytes. These distinct immune modules evolved to protect against 
the different types of challenge posed by diverse pathogens that target 
the gastrointestinal tract, and some immune pathology associated with 
their expression is one ‘cost’ of a functional immune system (Fig. 2). 


TNF-α and IL-6 

The central role of innate myeloid cells in IBD is mirrored by the potent 
pro-inflammatory effects of the cytokines that they secrete, particularly 
TNF-α and IL-6. The successful application of anti-TNF-α antibody 
treatment signified a major breakthrough in the treatment of IBD”, and 
resulted directly from convergent data in animal IBD models indicating 
a role for TNF-α in chronic intestinal inflammation’. IL-6 is increased 
in the inflamed intestinal mucosa, and blockade of IL-6 signalling 
ameliorated colitis in mouse models and also had beneficial effects 
in a clinical trial of patients with Crohn’s disease’. Despite the success 
of anti-TNF-α biologics in IBD, approximately one-third of patients 
do not respond to anti-TNF-α treatment, and many others eventually 
lose responsiveness or become intolerant to these agents”. In addition, 
patients treated with anti-TNF-α show an increased incidence of severe 
infections and malignancies, emphasizing the need for further therapies 
that may target intestinal inflammation more selectively”. 


IL-23, TH1 and TH17 responses 

Early studies on the production of T-cell-derived cytokines suggested 
a role for IL-13-producing natural killer T (NKT) cells in ulcerative 
colitis, and the differential activation of IL-12 p40 and TH1-cell 
responses was associated with Crohn's disease’. The colitis-attenuating 
effects observed in mice lacking the Il12b (also known as Il-12p40) 
gene, or given neutralizing antibodies directed against IL-12 p40 or 
interferon-γ (IFN-γ), also emphasized a key role for TH1-cell responses 
in intestinal inflammation’. However, the discovery that another IL-12- 
p40-containing heterodimeric cytokine, IL-23, was the central driver 
in several autoimmune pathologies prompted analysis of the role of 
IL-23 in intestinal inflammation. Studies in several mouse IBD models 
have used selective targeting of the IL-23 p19 subunit to demonstrate 
that IL-23 plays a key part in chronic intestinal pathology”. These 
findings were quickly followed by genome-wide association studies 
(GWAS) that reported strong associations of polymorphisms in the 
IL23R and IL12B gene loci with Crohn's disease and ulcerative colitis 
(see page 307). IL-23 is induced by PRR stimulation and is constitutively 
expressed in a small population of dendritic cells present in the lamina 
propria of the terminal ileum, although in patients with Crohn's disease, 
CD14+ intestinal macrophages have also been reported to secrete large 
amounts of IL-23 (ref. 37). The factors that determine whether an 
activated dendritic cell will preferentially produce IL-23 or IL-12 are 
not clear, but a recent study showed that ER stress and activation of 
the UPR can synergize with TLR signals to selectively increase IL-23 
expression by dendritic cells*. Although IL-23 was initially linked to the 
preferential expression of TH17 responses, it can promote a wide range of 
pathological responses in the intestine, mediated either by T cells or by 
excessive innate immune activation®®”’. IL-23-mediated enhancement 
of TH1 and TH17 responses is consistent with the increased levels of 
IFN-γ, IL-17 and IL-22 observed in the chronically inflamed intestine”. 
Transcription factors that direct TH1-cell or TH17-cell responses, such 
as T-bet (also known as T-box protein 21 and TBX21) or retinoic-acid- 
receptor-related orphan receptor-γt (RORγt), respectively, were 
shown to be essential for T-cell-mediated colitis’”°. T-cell-intrinsic 
IL-23R signals favour the expression of pathogenic pro-inflammatory 
T-cell responses in several ways, including enhanced proliferation of 
effector T cells, reduced differentiation of FOXP3+ Treg cells and the 
emergence of IL-17+IFN-γ+CD4+ T cells”. The precise origins, stability 
and pathogenic properties of these IL-17+IFN-γ+CD4+ T cells remain to 
be determined. Notably, IL-17+IFN-γ+CD4+ T cells have been isolated 
from the inflamed lamina propria of patients with Crohn’s disease, and 
were shown to respond to IL-23 and to be derived from a discrete subset 
of CD161+CD4+ T cells that express chemokine receptor 6 (CCR6) and 
the nuclear receptor RORγt*”™. In phase II clinical trials, anti-IL-12-p40 
monoclonal antibodies have shown clinical efficacy in a subset of patients 
with Crohn's disease, particularly in those who had not responded to 
anti-TNF-α therapy, suggesting that TNF-α and IL-12 (or IL-23) drive 
distinct pathways of immune pathology”. 

The relative enrichment of TH17 cells at mucosal sites, together with 
the increased levels of TH17 cytokines in the inflamed gut, has fuelled 
interest in their potential role in IBD pathogenesis*®*”’. TH17 cells 
produce several cytokines, including IL-17A, IL-17F, IL-21 and IL-22 
(ref. 36). Many studies have focussed on the roles of IL-17A and IL-17F, 
which are known to have pro-inflammatory effects in tissues such as the 
lung and brain, through the elaboration of cytokines and chemokines, 
particularly those that promote neutrophil recruitment”. Analyses of 
IL-17A and IL-17F in mouse colitis models have produced conflicting 
results. In acute DSS colitis, IL-17A has a protective role, whereas IL-17F 
seems to exacerbate disease’. By contrast, the neutralization of IL-17A 
attenuated chronic colitis in mice with Stat3-deficient Treg cells“ and 
decreased innate immune colitis after H. hepaticus infection”. Studies 
in T-cell-transfer colitis models suggest that IL-17A and IL-17F can 
have redundant pro-inflammatory effects in the gut”. TH17 responses 
were also recently implicated in a model of colitis-associated cancer, 
because IL-17A depletion reduced colitis and tumour development”. 
The microbiota has an important role in the preferential localization 
of TH17 cells in the gut, as the colonization of germ-free mice with 
segmented filamentous bacteria led to the marked accumulation of 
TH17 cells in the intestinal lamina propria”. 

Another TH17 cytokine that is highly dependent on IL-23 is IL-22, 
which enhances the innate immunity of tissues. Expression of the 
IL-22R complex is restricted to non-haematopoietic cells, especially 
epithelial cells in the skin, gut and lungs”. IL-22 signalling in IECs drives 
the production of AMPs and also promotes epithelial regeneration and 
healing by activating the transcription factor STAT3 (ref. 50). Consistent 
with this epithelial-protective role, IL-22 administration attenuated 
disease severity in the DSS and T-cell receptor-α (Tcra−/−) mouse IBD 
models, by restoring goblet cells and mucus production”. By contrast, 
other studies support a pathogenic role for IL-22 in IBD, as its expression 
is increased in patients with Crohn's disease, and high serum IL-22 levels 
correlate with increased disease activity and susceptibility-associated 
IL23R polymorphisms”. A recent study of bleomycin-induced airway 
inflammation showed that IL-22 could mediate either tissue-protective 
or pathogenic functions, depending on the absence or presence of 
IL-17A, respectively”. Thus, further studies are required to determine 
whether IL-17A or other pro-inflammatory mediators can have similar 
modulating effects on IL-22 activity in the gut. Although less extensively 
studied, IL-21 may also regulate intestinal inflammation, through effects 
on TH17 cells and the production of matrix metalloproteinases, which 
are involved in tissue remodelling®®. In summary, although it is clear that 
TH17 cytokines are important in many aspects of intestinal homeostasis 
and protection from mucosal pathogens, their role in IBD pathogenesis 
remains ambiguous, and further investigations are necessary to clarify 
their potential for therapeutic intervention. 

NLRs and inflammasome-associated cytokines 

Recent advances identifying a central role for inflammasomes and NLRs 
in autoinflammatory diseases”, together with the association of IBD 
with polymorphisms in NLRP3 and IL18RAP, have rekindled interest 
in the potential roles of IL-1β and IL-18 in IBD. Levels of IL-1β and 
IL-18 are increased in IBD’, and Il18−/− mice were resistant to colitis 
induced by trinitrobenzene sulphonic acid™, suggesting that IL-1 
and IL-18 enhance chronic intestinal pathology. This hypothesis is 
consistent with the ability of IL-1β and IL-18 to promote TH17 and TH1 
responses, respectively” 352 and with studies indicating that ATG16L1 
and α-defensins negatively regulate IL-1 expression”. Furthermore, 
this inflammatory axis is important in responses to gut pathogens, as 
ASC-mediated IL-1 production has an essential role in Clostridium 
difficile toxin-induced intestinal pathology”, and both IL-1β and 
IL-18 were required for the induction of intestinal inflammation after 
infection with S. Typhimurium™. Thus, inflammasome-forming NLRs 
can contribute to intestinal pathology through IL-1 and IL-18, and 
further studies are required to define their roles in IBD”. 


Conserved innate and adaptive effector modules 

Although predominantly attributed to CD4+ TH cells, many innate 
leukocyte populations, including γδ T cells, NKT cells and natural 
killer cells, can secrete TH1 and TH17 cytokines such as IFN-γ, IL-17A 
and IL-22 (refs 36, 49, 55-57). In particular, γδ T cells can express the 
TH17-associated transcription factors RORγt and the aryl hydrocarbon 
receptor (AHR), as well as the homing receptor CCR6, and can secrete 
IL-17 and IL-22 in response to IL-23 and IL-1β*”. 

However, recent studies have converged on the identification of 
several new innate lymphoid cell (ILC) populations present in the gut 
that can produce these pro-inflammatory cytokines®°*. Although 
many functionally heterogeneous ILC populations have been described, 
their phenotypic characteristics suggest that they are related to natural 
killer cells and lymphoid tissue inducer (LTI) cells***. Indeed, a 
recent cell-fate-mapping study suggested that several natural-killer- 
like and LTI-like ILC subsets were derived from a common RORγt+ 
precursor”, and functional specialization is coordinated by cytokines 
and microbiota-dependent signals that direct the expression of distinct 
transcription factors***™. In terms of IBD, we identified a population 
of CD90* CD4 ILCs that accumulated in the inflamed colons of 
Rag−/− mice infected with H. hepaticus*. These cells expressed high 
levels of IL-23R and RORγt and produced IFN-γ, IL-17A and IL-22 in 
response to IL-23, and their depletion with an anti-CD90 monoclonal 
antibody led to the attenuation of typhlocolitis®. A similar population 
of IL-23-responsive CD90+CD4+ LTI cells was recently shown to be 


302 | NATURE | VOL 474 | 16 JUNE 2011 


Figure 2 | Conserved innate and adaptive 
immune effector modules in the gut. IECs and 
intestinal dendritic cells sense distinct infectious 
agents, leading to the production of factors that 
direct different effector responses (black arrows). 
Local innate leukocyte populations can rapidly 
produce these effector cytokines to restrict 
pathogen growth until specific adaptive responses 
have been induced. Cells and molecules associated 
with distinct effector profiles are colour-coded; 
TH1 type (blue), TH17 type (red), mixed TH1 and 
TH17 (brown), TH2 type (purple), and aberrant or 
prolonged expression of any of these modules may 
contribute to chronic intestinal inflammation. 
Regulatory T-cell circuits (green) can suppress all 
types of inflammatory effector response and may 
enhance the production of protective secretory 
IgA (sIgA) antibodies. 17+γ+, IL-17- and IFN- 
γ-secreting CD4+ T cell; ILC2, type-2 ILC; NK, 
natural killer; RA, retinoic acid; TSLP, thymic 
stromal lymphopoietin. 
an essential source of protective innate IL-22 during the initial phase 
of Citrobacter rodentium infection”. Other studies have identified 
further populations of innate leukocytes in the gut and mesenteric 
adipose tissues that secrete large amounts of the TH2 signature cytokines 
IL-4 and IL-13 (refs 63-65). These type-2 ILCs are RORγt− and were 
rapidly activated in response to IEC-derived IL-25 and IL-33 after 
infection with parasitic helminth worms, an infection in which TH2 
cytokines are key contributors to protective immunity. Although 
the precise relationships of these cells to one another, and to the 
RORγt+ ILC populations discussed earlier, remain to be determined, 
these results further emphasize that various populations of intestinal 
innate leukocytes can rapidly respond to different types of infection by 
producing appropriate effector cytokines. 

The emerging data on shared innate and adaptive cytokine profiles 
suggest that these conserved immune modules evolved before adaptive 
lymphocytes, and that protective immunity can be mediated by 
sentinel tissue-resident innate leukocytes early after infection, whereas 
subsequent T-cell responses add memory and specificity to the relevant 
protective axis (Fig. 2). 


Adaptive regulation of intestinal inflammation 

The intestine contains an extensive network of dendritic cells and 
macrophages that has an important role in shaping adaptive immunity 
in response to intestinal environmental cues”. Under homeostatic 
conditions, both dendritic cells and macrophage populations have 
specific adaptations that promote tolerance. During infection, however, 
responses shift to a more inflammatory nature, which can lead to 
immune pathology when dysregulated. 


Antigen-presenting myeloid cells 

Intestinal myeloid antigen-presenting cell (APC) populations are 
heterogeneous in terms of phenotype, function, developmental origin 
and anatomical location”. Recently, two major populations of 
intestinal dendritic cells have been identified on the basis of differential 
expression of the integrin subunit CD103 and the chemokine receptor 
CX3CR1. CD11chighCD103+ dendritic cells share developmental origins 
with lymphoid tissue dendritic cells and are derived from pre-dendritic 
cells without a monocyte intermediate™”. By contrast, monocytes give 
rise to intestinal CD11chighCD103−CX3CR1+ dendritic cells, suggesting 
a close relationship between these cells and CX3CR1high intestinal 
macrophages. CD103+ dendritic cells are dispersed throughout the 
lamina propria and in organized lymphoid structures. In the small 
intestine, they act as important sentinel cells as they can take up 
pathogenic and commensal bacteria, as well as innocuous antigens or 
apoptotic IECs. After maturation, CD103+ dendritic cells migrate to the 
draining mesenteric lymph node (MLN), where they initiate adaptive 
responses” focused on the intestine, including upregulation of the 
gut-homing receptors CCR9 and α4β7-integrin on activated T cells, and 
IgA class-switch recombination by intestinal B cells”. These properties 
depend on the production of the vitamin A dietary metabolite retinoic 
acid, and CD103+ dendritic cells express the retinal-metabolizing 
enzyme genes ALDH1A1 and ALDH1A2, although it is not known 
whether CD103+ dendritic cells are an essential functional source of 
retinoic acid in vivo. 

CD103+ dendritic cells preferentially induce tolerance pathways, 
including FOXP3+ Treg cells in the draining MLN by a TGF-β-dependent 
and retinoic-acid-dependent mechanism”. The intestinal pathways that 
influence CD103+ dendritic cell function in vivo are poorly understood, 
but they do not seem to depend on the microbiota”. A recent study 
showed that ablation of β-catenin expression in mouse dendritic cells 
led to reduced frequencies of Treg cells and higher frequencies of effector 
TH1 and TH17 cells in the intestinal lamina propria”. This correlated 
with reduced messenger RNA levels of Il10, Tgfb, Aldh1a1 and Aldh1a2, 
resulting in enhanced susceptibility to DSS colitis. How β-catenin 
expression is regulated in intestinal dendritic cells is not known. 

The functional properties of CD103+ dendritic cells are not 
hardwired, and they acquire inflammatory properties during intestinal 
inflammation such as the ability to produce IL-6 and drive TH1 
responses”*. Thus, migratory CD103+ dendritic cells can promote both 
tolerogenic and effector T-cell responses, and further work is required 
to identify quantitative and qualitative factors that drive intestinal 
dendritic cell conditioning and the effect of these factors on adaptive 
effector pathways. 

CX3CR1+CD103− APCs comprise a heterogeneous population of 
dendritic cells and macrophages. CD11c+CX3CR1+ dendritic cells 
are present adjacent to the intestinal epithelium and can extend 
processes through the epithelium to sample antigens and bacteria®”. 
However, CD11c+CX3CR1+ dendritic cells do not seem to migrate to 
the MLN and fail to prime naive T cells, suggesting that their main 
role may be to modulate local adaptive intestinal responses”. CX3CR1+ 
APCs accumulate in response to microbiota-derived signals”, and 
a CD70high subset promotes colonic TH17 responses in response to 
commensal-derived ATP”. Colonic macrophages contribute to 
intestinal homeostasis in several ways. As highly phagocytic cells, 
they clear apoptotic cells and debris and contribute to wound repair of 
the epithelium”. Intestinal macrophages have adaptations to prevent 
excessive inflammatory responses towards the intestinal flora, including 
expression of inhibitors of NF-κB signalling that permit bactericidal 
activity in the absence of TLR-driven pro-inflammatory cytokine 
production”. Recent evidence suggests that these cells also promote 
tolerance in part through IL-10 production and maintenance of FOXP3 
among colonic Treg cells”. A similar APC population in the small 
intestine induced FOXP3+ Treg cells in vitro and promoted Treg-cell 
proliferation in a CX3CR1-dependent manner”. 

In IBD and experimental colitis, there is an increase in dendritic 
cell and macrophage populations that may contribute to intestinal 
pathology through pro-inflammatory cytokine production”. Although 
it is not yet fully established whether this reflects changes in resident 
myeloid cell populations or the accumulation of newly recruited cells, 
there is evidence to support the latter. Thus, acute and chronic mouse 
colitis models were associated with a marked increase in recruited 
monocyte-derived dendritic cells that produced IL-12, IL-23 and TNF-α 
and showed enhanced TLR responsiveness®”’. A similar population of 
inflammatory macrophages that promote colonic inflammation has 
also been described”. The data are consistent with a model in which 
sustained pro-inflammatory cytokines and chemokines promote 
myelopoiesis — the mobilization of monocytes from the bone marrow 
to the blood and the recruitment of inflammatory monocytes to the 
inflamed intestine. These recruited myeloid cells lack gut-specific 
adaptations associated with tolerance and instead mediate inflammatory 
responses after microbial challenge. Further understanding of the 
factors that control the recruitment and function of myeloid cells in 
intestinal inflammation may provide new therapeutic targets. 


Regulatory T-cell populations 

Although various T-cell populations have anti-inflammatory functions, 
FOXP3+ Treg cells and FOXP3− IL-10-secreting CD4+ T cells are 
particularly important in the intestine”. Most of the former acquire 
FOXP3 expression in the thymus and represent a functionally distinct 
population that has a non-redundant role in controlling immune 
homeostasis. Deletion or loss-of-function mutations in the gene 
encoding FOXP3 result in a fatal inflammatory disease in mice, and 
in immune dysregulation, polyendocrinopathy, enteropathy, X-linked 
(IPEX) syndrome in humans, which is often accompanied by intestinal 
inflammation”. FOXP3+ Treg cells are abundant in the small intestine 
and colon, where they control potentially deleterious responses to 
dietary and microbial stimuli”. In addition to thymic-derived Treg cells, 
the intestine is also a preferential site for TGF-β-dependent induction 
of FOXP3+ Treg cells from naive CD4+ T-cell precursors”. Such antigen- 
induced Treg cells are expanded in models of oral tolerance and can 
control local and systemic antigen-induced hypersensitivity responses. 
Little is known about the antigen specificity of intestinal Treg cells that 
control microbiota-driven responses. However, Treg-cell accumulation 
in the colon is reduced in germ-free mice and can be increased by 
particular indigenous bacteria, suggesting a role for the microbiota in 
promoting intestinal Treg-cell responses”. 

Induced Treg-cell and TH17-cell populations seem to be reciprocally 
regulated in the intestine. Although TGF-β is required for the 
differentiation of both populations, the presence of STAT3-mediated 
signals (such as IL-6 or IL-23) promotes TH17 cells at the expense of 
FOXP3+ Treg cells”*'. Such a mechanism allows the inflammatory 
response to override Treg-cell induction in the presence of pro- 
inflammatory stimuli, promoting intestinal effector T-cell responses 
and host defence. Recent evidence suggests that bacterial components 
differentially affect this balance, providing potential therapeutic 
strategies to influence tolerance and immunity in the gut”. 

An important component of Treg-cell-mediated control of intestinal 
homeostasis is their ability to survive and compete with effector T cells 
in the intestinal niche*". This has recently been shown to involve the 
expression of co-stimulatory pathways and transcription factor modules 
associated with the colitogenic response’. For example, mice with 
a Stat3 deletion in FOXP3" T,,, cells develop aggressive colitis owing 
to uncontrolled T,,17 responses“. In addition to STAT3, Treg Cells 
can express several transcription factors associated with particular 
effector responses, including T-bet, IRF4 and GATA3 (ref. 81). Under 
homeostatic conditions these allow T,,,,-cell-mediated control of distinct 
effector modules. However, the system is delicately poised and can 
sometimes lead to T,,,-cell instability. For example, high-level T-bet 
expression in the presence of acute intestinal infection drives T,,.. cells 
into an inflammatory IFN-y-secreting phenotype™. 


The immune-suppressive modules TGF-β and IL-10 
TGF-β is present at high concentrations in the intestine and has a crucial 
involvement in modulating the immune response. Deletion of Tgfb1 in 
mice leads to a fatal inflammatory disease similar to FOXP3 deficiency. 
Cell-type-specific targeting of TGF-β or its receptor has shown a key 
role for T cells in both the production of and responsiveness to TGF-β 
that is required to maintain immune homeostasis. T cells that cannot 
respond to TGF-β escape Treg-cell-mediated control, and T cells from 
patients with IBD are refractory to the anti-inflammatory actions of 
TGF-β through expression of the negative regulator of TGF-β signalling 
SMAD7 (ref. 86). Whether this is a primary or secondary event is not 
known, but it suggests that restoring TGF-β responsiveness may have 
therapeutic benefit in IBD. 

TGF-β is produced as an inactive precursor that needs to be 
post-translationally modified to become biologically active. This is a 
tightly controlled process that has recently been shown to involve 
expression of the αvβ8-integrin molecule on intestinal dendritic cells 
and macrophages, as well as expression of the proprotein convertase 
furin on T cells. αvβ8-Integrin has also been implicated in 
myeloid-cell uptake of apoptotic cells, a process that has been linked to TGF-β 
production. Because IECs undergo apoptosis under physiological 
conditions, this may provide a source of TGF-β that promotes 
tolerance under homeostatic conditions. Enhanced apoptosis of 
IECs accompanies infection and inflammation, and under these 
circumstances TGF-β in the presence of pro-inflammatory cytokines 
such as IL-6 promotes the development of inflammatory TH17 
responses. Lastly, a recent report suggests that TGF-β-dependent 
stimulation of intestinal IgA responses is another mechanism through 
which Treg cells can reinforce intestinal homeostasis. 

IL-10 is produced by a wide range of leukocytes, including T cells, 
B cells and myeloid cells, and its deletion in mice leads to colitis 
development. CD4+ T-cell-produced IL-10 is required to prevent 
intestinal inflammation, with functional contributions from both 
FOXP3+ and FOXP3− CD4+ cells. The intestine contains large 
numbers of CD4+IL-10+ cells; in the colon, these are mainly FOXP3+, 
whereas both FOXP3+ and FOXP3− IL-10+ cells are present in the small 
intestine. Intestinal bacteria can promote the activity of colonic Treg 
cells by inducing IL-10 production, and recent evidence suggests a 
specific role for particular Clostridium species in this process. Unlike 
FOXP3-expressing cells, FOXP3− IL-10+ CD4+ cells may represent 
a more heterogeneous mix, because most effector T-cell subsets, 
including TH1, TH2 and TH17 cells, produce IL-10 after chronic 
immune stimulation. Myeloid sources of IL-10 are important in some 
settings, as IL-10 production by intestinal macrophages promoted 
FOXP3+ Treg-cell function in an adoptive transfer model of colitis. 
IL-10 controls chronic intestinal inflammation partly through direct 
anti-inflammatory effects on myeloid cells. Evidence for the role of 
IL-10 in human IBD comes from findings that mutations in the IL-10 
receptor genes IL10RA and IL10RB lead to severe early-onset IBD. 
GWAS have also identified single nucleotide polymorphisms in IL10 
associated with susceptibility to Crohn's disease and ulcerative colitis 
(see page 307). Together, both genetic and functional studies highlight 
the importance of IL-10 in intestinal homeostasis and suggest that the 
ability of intestinal bacteria to induce IL-10 may be an important facet 
of host-commensal mutualism. 


A multihit model of IBD 

As noted above, in some cases single gene defects in crucial regulatory 
circuits, such as the IL-10 pathway, can trigger severe IBD in infants. 
However, the heterogeneity of IBD and the low disease penetrance 
in individuals carrying disease-susceptibility alleles suggest that, 
in most patients, several host and environmental factors interact to 
cause IBD. There is evidence to suggest that Crohn’s disease stems 
from an immunodeficiency of macrophages that results in defective 
acute inflammatory responses and impaired clearance of commensal 
bacteria, leading to the subsequent expression of chronic granulomatous 
inflammation. Pathogenic infections may act as triggers or 
contributing factors for IBD, and adherent-invasive Escherichia coli are 
frequently present in close association with the ileal mucosa in patients 
with Crohn’s disease. 

Figure 3 | A multihit model of IBD pathogenesis. The 
induction and perpetuation of chronic intestinal 
pathology may require the convergence of many 
abnormalities that affect several overlapping layers of 
immune homeostasis in the intestine. Defects in one 
layer are unlikely to precipitate IBD in the absence 
of further pathogenic lesions (panels on left). These 
layers are in turn controlled by a range of cell-intrinsic 
and cell-extrinsic regulatory modules that function in 
an interrelated way to maintain homeostasis (panels 
on right). Defects in these homeostatic modules may 
predispose people to the development of IBD by 
affecting several layers of immune homeostasis (filled 
panels). The dashed line denotes the threshold at which 
the level of inflammation manifests as IBD. 

It is not yet clear whether the presence of E. coli is a cause or effect of 
colitis, as recent studies have highlighted that intestinal inflammation 
can confer a selective growth advantage to certain pathogens, including 
S. Typhimurium. A growing body of evidence suggests that IBD is 
associated with an imbalance in the composition of the intestinal 
bacterial microbiota, termed dysbiosis. Patients with IBD, particularly 
those with Crohn’s disease, have alterations in the gut microbiota, with 
reduced diversity in major phyla, such as Firmicutes and Bacteroidetes, 
and increased numbers of Enterobacteriaceae. A key unresolved issue 
is whether dysbiosis represents a primary or secondary predisposing 
factor for IBD, as it may be related to, or compounded by, other 
defects. Recent studies have indicated that dysbiosis is influenced both 
by the host genotype, such as the presence of NOD2 or ATG16L1 
susceptibility alleles, and by IBD phenotype, with patients with 
ileal Crohn’s disease showing the most pronounced changes. It is 
interesting that core commensals belonging to the Clostridiales order, 
such as Faecalibacterium and Roseburia, were significantly reduced in 
patients with ileal Crohn’s disease. These genera are potent sources 
of short-chain fatty acids, such as butyrate, that have been shown to 
have protective effects in mouse colitis models. In addition, clostridial 
groups IV (which includes Faecalibacterium) and XIVa were recently 
shown to promote the accumulation of FOXP3+ Treg cells in the mouse 
colon. Dietary factors may also affect microbiota composition, leading 
to alterations in intestinal immune homeostasis. 

Taken together, these studies are beginning to illuminate some 
of the complex interactions between different host genetic and 
environmental factors that can predispose patients to the development 
of IBD. The induction and perpetuation of chronic intestinal pathology 
may require additive lesions that affect several layers of immune 
regulation in the gut (Fig. 3). Early animal model studies showed that 
an interaction between the intestinal flora and host factors is required 
for the development of intestinal inflammation. For example, colitis in 
IL-10-deficient mice requires the presence of triggering bacteria such 
as H. hepaticus. Two recent studies further illustrate how 
multiple lesions may interact to elicit intestinal pathology. The first of 
these found that the Paneth cell abnormalities present in mice carrying 
a hypomorphic mutant of the Atg16l1 autophagy gene (Atg16l1HM) were 
triggered by persistent infection with an enteric mouse norovirus strain 
(MNV CR6). In addition, Atg16l1HM mice infected with MNV CR6 
developed exacerbated pathology after the administration of DSS, 
with characteristics of human Crohn’s disease, including blunting of 
villi in the ileum, that depended on IFN-γ, TNF-α and the presence of 
commensal bacteria. Thus, Crohn’s-disease-like pathology required 
persistent viral infection of a genetically susceptible host together with 
environmental factors and commensal bacteria. The second study 
examined a model of transmissible ulcerative-colitis-like disease that 
arises in T-bet−/−Rag2−/− (also known as Tbx21−/−Rag2−/−) mice and is 
associated with defective colonic barrier function and hyperactivation 
of inflammatory dendritic cells. Transmission of a milder degree of 
colitis to co-housed wild-type mice correlated with the presence of the 
Enterobacteriaceae species Proteus mirabilis and Klebsiella pneumoniae, 
but also required the presence of an endogenous microbiota. Thus, 
maximal colitis involved barrier defects, hyperactivated innate 
immunity, an absence of Treg cells and alterations in the microbiota 
composition. 


Perspectives 

Recent advances in mapping the genetic basis of disease susceptibility, 
coupled with rapid improvements in characterization of the microbiota 
in healthy and diseased individuals, offer great hope for the continued 
development of new IBD treatments. However, several key issues 
need to be understood better. These include distinguishing between 
individuality in IBD aetiology and commonality in pathogenic effector 
modules, so that therapies may be tailored to appropriate patient 
subgroups, such that distinct responses may be either suppressed or 
enhanced to restore homeostasis. The influence of microbiota-derived 
molecules on local and systemic immune responses is an area of great 
promise, but it will also be important to determine how immune 
responses feed back into shaping the composition of the microbiota 
and how different members of the microbiota interact within different 
environments in the gut, as well as to determine how to stably manipulate 
the gut microbiota. The real extent of effector T-cell plasticity in vivo 
and whether innate effector responses are similarly malleable needs 
to be investigated further, to establish whether the stable conversion 
of deleterious responses into beneficial ones may be achieved. 
Accomplishing these goals will require the cooperation of scientists 
working across several disciplines, an improved characterization of the 
pathophysiology of disease models and application of new technical 
approaches to clinical samples from patients with IBD. 


Genetics and pathogenesis of 
inflammatory bowel disease 

Recent advances have provided substantial insight into the maintenance of mucosal immunity and the pathogenesis 
of inflammatory bowel disease. Cellular programs responsible for intestinal homeostasis use diverse intracellular and 
intercellular networks to promote immune tolerance, inflammation or epithelial restitution. Complex interfaces integrate 
local host and microbial signals to activate appropriate effector programs selectively and even drive plasticity between 
these programs. In addition, genetic studies and mouse models have emphasized the role of genetic predispositions and 
how they affect interactions with microbial and environmental factors, leading to pro-colitogenic perturbations of the 
host-commensal relationship. 


Inflammatory bowel disease (IBD) comprises the chronic relapsing 
inflammatory disorders Crohn’s disease and ulcerative colitis. Family 
history is a risk factor for developing IBD, with a peak incidence in 
early adult life, although individuals of any age can be affected. IBD is 
thought to result from an inappropriate and continuing inflammatory 
response to commensal microbes in a genetically susceptible host. 
Recent progress in understanding IBD pathobiology offers insight 
into relevant disease mechanisms in mucosal immunity, including 
how genetic factors interact with microbial and environmental cues 
within tissue-specific contexts, the biological checkpoints involved, the 
selective decisions made during the course of disease and how plasticity 
of the biological response results in the capacity for different phenotypes. 

Ulcerative colitis is characterized by inflammation that is limited to 
the colon: it begins in the rectum, spreads proximally in a continuous 
fashion and frequently involves the periappendiceal region. By 
contrast, Crohn’s disease involves any part of the gastrointestinal tract 
(most commonly the terminal ileum or the perianal region) in a 
non-continuous fashion and, unlike ulcerative colitis, is commonly 
associated with complications such as strictures, abscesses and fistulas. 
Histologically, ulcerative colitis shows superficial inflammatory 
changes limited to the mucosa and submucosa with cryptitis and crypt 
abscesses. The microscopic features of Crohn’s disease include thickened 
submucosa, transmural inflammation, fissuring ulceration and 
non-caseating granulomas. 

Among complex diseases, genome-wide association studies 
(GWAS) have been successful in IBD, identifying 99 non-overlapping 
genetic risk loci, including 28 that are shared between Crohn’s disease 
and ulcerative colitis (Fig. 1). The genes implicated in 
childhood-onset and adult-onset IBD overlap, suggesting similar contributory 
genetic predispositions and pathophysiological pathways. Adding to 
the complexity of understanding disease mechanisms, a susceptibility 
allele often requires other genetic and non-genetic cues to manifest 
disease. The concordance rate in monozygotic twins of 10-15% in 
ulcerative colitis compared with 30-35% in Crohn’s disease suggests 
that non-genetic factors may have an even more important role in 
ulcerative colitis than in Crohn’s disease. Furthermore, the higher 
penetrance of common Crohn’s-disease-associated polymorphisms 
in genetic case-control studies than in population-based studies of 
cohorts of the same ethnicity is probably due to the concomitant 
aggregation of both genetic and environmental factors in the 
case-control studies. Smoking is an example of a disease-specific modifier 
that seems to exacerbate Crohn’s disease while being protective 
against ulcerative colitis. Evidence suggests that smoking impairs 
autophagy, a process thought to be involved especially in Crohn’s 
disease, demonstrating how exposure to a disease modifier in a 
genetically predisposed individual may mechanistically affect IBD 
development. 

In this Review, we provide an overview of genes and susceptibility 
loci implicated in IBD by GWAS and other genetic studies. Candidate 
genes are discussed in the context of IBD-relevant pathways, as well as 
how these molecular pathways interact with environmental factors to 
modulate intestinal homeostasis. 


Genes and pathways in IBD 

International collaborative research groups focusing on an unbiased 
appraisal of the human genome have been particularly successful in 
identifying genes and genetic loci that contribute to IBD susceptibility. 
Despite distinct clinical features, approximately 30% of IBD-related 
genetic loci are shared between ulcerative colitis and Crohn’s disease, 
indicating that these diseases engage common pathways and may be 
part of a mechanistic continuum (Fig. 1). 

Analyses of the genes and genetic loci implicated in IBD show several 
pathways that are crucial for intestinal homeostasis, including barrier 
function, epithelial restitution, microbial defence, innate immune 
regulation, reactive oxygen species (ROS) generation, autophagy, 
regulation of adaptive immunity, endoplasmic reticulum (ER) stress and 
metabolic pathways associated with cellular homeostasis (Fig. 2). Early 
studies have suggested the existence of both protective and predisposing 
alleles. Disease-relevant biological pathways are further highlighted 
when several components are implicated as risk factors together (Fig. 3). 

Multidisease comparative analysis can uncover common disease- 
causing genes and pathways. More than 50% of IBD susceptibility loci 
have also been associated with other inflammatory and autoimmune 
diseases. These overlapping genes can have contrasting effects in 
different diseases. For example, the same coding variant of PTPN22 
(R620W) is a strong risk factor for type 1 diabetes and rheumatoid 
arthritis, but is protective against Crohn’s disease. These data suggest 
that crucial clues to disease biology may reside in understanding the 
function of these shared genes. Several loci containing genes such as 
MST1, IL2, CARD9 and REL are shared between ulcerative colitis and 
the associated complication primary sclerosing cholangitis (PSC). 
This overlap may help to identify subsets of patients with ulcerative 
colitis who are at risk of PSC. Risk loci for Crohn’s disease present an 
unexpected overlap with susceptibility regions for Mycobacterium leprae 
infection, including genes such as NOD2, C13orf31 and LRRK2 (ref. 9). 
Although absent from the leprosy GWAS, other 
Crohn’s-disease-associated genes are also implicated in host responses to mycobacterial 
infection, including CARD9, LTA, ITLN1 and IRGM. Thus, studies 
to delineate immune responses to antigens from, and infection by, 
mycobacteria, or other microbes that elicit similar host cell responses, 
may also be pertinent to Crohn’s disease. 

Genetic variants associated with IBD can vary in frequency 
depending on the cohort ethnicity, raising the possibility that some 
such variants may have emerged in the context of historical selective 
pressures. Although this notion remains to be demonstrated in IBD, 
lessons from other autoimmune and infectious contexts lend support. 
For example, variants of apolipoprotein L1 and the inhibitory Fc 
receptor FcyRIIb that confer protection against trypanosomiasis and 
malaria, respectively, are more common in populations endemically 
exposed to these pathogens, but these variants also confer increased 
susceptibility to focal segmental glomerulosclerosis and systemic lupus 
erythematosus (SLE), respectively. 

Current GWAS are typically powered to characterize variants of 
>1% frequency and do not include the contributions from rare variants 
(<1% frequency). Exome sequencing can be useful for identifying rare 
variants, whereas whole-genome sequencing is of value in elucidating 
modifier loci. If pedigrees are available, rare variant discovery can 
be further targeted by fine mapping, as shown by the identification 
of IL10RA polymorphisms associated with the development of 
early-onset IBD. Other interleukin-10 receptor (IL-10R) signalling 
components have also been implicated by GWAS, including STAT3, 
TYK2, JAK2 and IL10 itself, in concordance with the notion that both 
rare and common variants may highlight the same pathway. Although 
these components can also function in other contexts (for example, 
the transcription factor STAT3 and the kinase proteins TYK2 and JAK2 
are involved in the signalling of the interleukins IL-6, IL-22 and IL-23), 
these results illustrate the value of genetic studies in determining 
not just single genes, but also disease-relevant pathways. Recent 
resequencing studies in IBD recovered both known and new variants 
of CARD9, NOD2 and IL23R, with independent effects on disease risk. 
The IL23R variants were protective, supporting previous findings of 
a common protective IL23R allele and illustrating how studies of rare 
variants can reinforce GWAS findings. Furthermore, T helper 17 
(TH17) cells generated ex vivo from subjects with a variant IL23R allele 
(R381Q) show decreased production of the pro-inflammatory cytokine 
IL-17A in response to IL-23 stimulation, emphasizing the importance 
of IL-23-related pathways in human IBD. 

Figure 1 | Genetic architecture of IBD-linked 
susceptibility loci. a, GWAS have identified 
71 risk loci in Crohn’s disease and 47 risk loci in 
ulcerative colitis (P value of association < 5 × 10⁻⁸). 
Of these, 28 risk loci exhibit shared associations 
(defined as P < 5 × 10⁻⁸ for either Crohn’s disease 
or ulcerative colitis, and P < 1x10 for the other 
form of IBD). Approximately half of the loci 
implicated in Crohn’s disease and ulcerative 
colitis are associated with cis- and/or 
trans-expression quantitative trait loci (eQTL) effects 
(left panels). Genes whose expression is affected 
by these variants could also be involved in IBD 
pathogenesis. The loci composition (right panels) 
shows the number of genes that either lie within 
or segregate in linkage disequilibrium with 
IBD-implicated loci (coefficient of correlation r² > 0.8). 
These loci are structurally heterogeneous, and are 
associated with widely ranging numbers of genes. 
Loci not associated with any genes, known as gene 
deserts, frequently contain non-coding transcripts 
or predicted open reading frames (ORFs), and can 
be associated with trans-eQTL effects. 
b, Recurring terms illustrating biological 
processes implicated by at least three genes 
represented in IBD loci; font sizes are proportional 
to the number of genes associated with each 
respective process. Breg cells, B regulatory cells; 
ER, endoplasmic reticulum; GPCR, 
G-protein-coupled receptor; IL, interleukin; lincRNA, 
large intervening non-coding RNA; miRNA, 
microRNA; ncRNA, non-coding RNA; NF-κB, 
nuclear factor-κB; ROS, reactive oxygen species; 
TH17 cells, T helper 17 cells; Treg cells, T regulatory 
cells. 
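
The figure captions in this Review select genes by linkage disequilibrium with IBD-associated SNPs at r² > 0.8. As a minimal illustrative sketch (not the authors' analysis pipeline), the standard r² statistic for two biallelic loci can be computed from haplotype and allele frequencies as below. The function names `ld_r_squared` and `in_strong_ld` and the example frequencies are hypothetical and chosen only for illustration; the 0.8 cut-off is the one quoted in the captions.

```python
def ld_r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """Return r^2 between allele A (locus 1) and allele B (locus 2).

    p_ab: frequency of the haplotype carrying both A and B
    p_a, p_b: population frequencies of allele A and allele B
    """
    for name, value in (("p_ab", p_ab), ("p_a", p_a), ("p_b", p_b)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0.0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    # D measures the departure of the haplotype frequency from independence.
    d = p_ab - p_a * p_b
    return d * d / denom


def in_strong_ld(p_ab: float, p_a: float, p_b: float,
                 threshold: float = 0.8) -> bool:
    """True if the two alleles exceed the r^2 threshold (0.8 by default)."""
    return ld_r_squared(p_ab, p_a, p_b) > threshold


if __name__ == "__main__":
    # Alleles that always travel together are in perfect LD.
    print(ld_r_squared(0.5, 0.5, 0.5))
    # Hypothetical frequencies: strongly linked versus independent alleles.
    print(in_strong_ld(0.28, 0.3, 0.3))
    print(in_strong_ld(0.09, 0.3, 0.3))
```

In practice r² is estimated from phased genotype data in a reference panel rather than from known haplotype frequencies, so this sketch only shows what the threshold measures.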

Early functional studies attempting to determine causality have 
largely focused on coding variants, although non-coding single 
nucleotide polymorphisms (SNPs) can be associated with qualitative 
and quantitative changes. Alternative splicing exemplifies a qualitative 
change affected by non-coding modifications. In the context of 
regulating immune responses, IL23R and NOD2 can encode truncated 
variants that inhibit their signalling pathways. Furthermore, 
genetic changes may affect transcription-factor-binding sequences, 
locus accessibility, translational efficiency and trans-regulators such 
as non-coding RNAs and microRNAs (miRNAs). In this regard, a 
Crohn’s-disease-associated synonymous variant in IRGM (c.313C>T) 
perturbs regulation by miR-196A and miR-196B, and is associated 
with altered IRGM expression in patients with Crohn’s disease 
who bear this SNP. Cis- or trans-expression quantitative trait loci 
(eQTL) are detected for approximately half of the IBD risk regions, 
indicating that allele-specific gene-expression changes contribute to 
disease risk (Fig. 1). Furthermore, IBD-implicated loci contain more 
than 10 miRNA-encoding sequences and 39 large intervening 
non-coding RNAs (lincRNAs), 5 of which interacted with the histone 
methyltransferase polycomb repressive complex 2 (PRC2), supporting 
the notion that regulation of gene expression by miRNAs and lincRNAs 
may be mechanistically relevant in IBD. 

So far, GWAS account for 23% and 16% of the heritability in Crohn’s 
disease and ulcerative colitis, respectively. Although these may be 
underestimates owing to the net effect of common variants that are 
individually too small to calculate accurately, the missing heritability 
may further comprise genetic, epigenetic and non-genetic (including 
environmental) components. Genetic factors such as rare variants, 
private mutations, structural variants and interactions between genes 
are not well captured by GWAS. Nevertheless, a key success of GWAS 
in IBD has been the ability to provide insight into disease pathobiology 
by highlighting key molecular pathways. 

Figure 2 | A model for IBD pathways based on GWAS. Intestinal 
homeostasis involves the coordinated actions of epithelial, innate and adaptive 
immune cells. Barrier permeability permits microbial incursion, which is 
detected by the innate immune system, which then orchestrates appropriate 
tolerogenic, inflammatory and restitutive responses in part by releasing 
extracellular mediators that recruit other cellular components, including 
adaptive immune cells. Genetic variants, the microbiota and immune factors 
affect the balance of these signals. Genes in linkage disequilibrium (r² > 0.8) 
with IBD-associated single nucleotide polymorphisms (SNPs) were manually 
curated and classified according to their function(s) in the context of intestinal 
homeostasis and immunity. Text colour indicates whether the genes are linked 
to risk loci associated with Crohn’s disease (CD; black), ulcerative colitis (UC; 
blue) or both (red). Asterisk denotes corresponding coding mutations; 
cis-eQTL effects are underlined. G, goblet cell; P, Paneth cell. 


Epithelial encounters and pathogenicity 

The intestinal mucosa exists in a functional equilibrium with the 
complex luminal milieu, which is dominated by a spectrum of microbial 
species and their products. Maintaining this functional balance is 
central to preserving normal mucosal physiology, with perturbations 
contributing to the pathophysiology of many gastrointestinal disorders, 
including IBD. In addition to nutrient absorption, intestinal epithelial 
cells (IECs) perform both barrier and signal-transduction functions, 
with the capacity to sense luminal contents through surface receptors 
and, in return, secrete regulatory products that can orchestrate an 
appropriate response in the underlying lamina propria. 

Molecular details of the epithelial barrier and the structure of tight 
junctions, which are crucial to its integrity, have been characterized. 
Abnormal intestinal permeability has been observed in IBD patients 
and in some of their first-degree relatives. Genes within several 
IBD-associated loci indicate a role for barrier integrity in disease 
predisposition, implicating candidate genes such as CDH1, GNA12 and 
PTPN2. Genetic studies have shown that truncated forms of the adherens 
junction protein E-cadherin (encoded by CDH1) are associated with 
Crohn’s disease, and intestinal biopsies from patients with Crohn’s disease 
carrying these mutant alleles show inappropriate protein localization and 
cytosolic accumulation. Activation of the G protein Gα12 (encoded by 
GNA12) leads to phosphorylation of the tight junction proteins ZO-1 
and ZO-2, resulting in destabilization of cell junctions in epithelial cell 
lines. In vitro studies show that the protein tyrosine phosphatase family 
member PTPN2 protects against interferon-γ (IFN-γ)-induced epithelial 
permeability; concordantly, Ptpn2-deficient mice show increased 
susceptibility to experimental colitis. 

Figure 3 | Genetic variants in IBD signalling modules. Schematic of 
selected signalling pathways involved in the maintenance of intestinal 
homeostasis, including epithelial junctional complex assembly, innate immune 
recognition of pathogen-associated motifs, GPCRs and immune defence, 
anti-inflammatory interleukin-10 (IL-10) signalling, TH17-cell differentiation, 
inhibitory pathways in lymphocyte signalling, and B-cell activation and 
IgA antibody responses. Proteins encoded by genes identified as being in 
linkage disequilibrium with IBD-risk SNPs (r² > 0.8) are highlighted in red. 

Genetic studies have associated IBD with several transcription factors 
involved in epithelial regeneration, such as HNF4A and NKX2-3, which 
control crypt cell proliferation and IEC differentiation, respectively. 
Spontaneous colitis did not occur in all animal models with IEC-specific 
deletion of Hnf4a, suggesting that further environmental triggers are 
required for disease. STAT3, whose gene lies within an 
IBD-implicated locus, is activated in epithelial cells from patients with 
IBD, and IEC-specific Stat3 deletion affects epithelial repair. 

The intestinal barrier is enhanced by the presence of a pre-epithelial 
layer formed primarily of mucus glycoproteins, trefoil peptides, IgA 
and antimicrobial peptides (AMPs). Goblet cells generate the mucus 
layer, a protective polysaccharide bilayer rich in cationic proteins, 
the inner layer of which is essentially devoid of microbes. Patients 
with IBD frequently have a compromised mucus layer and increased 
mucolytic bacteria; mucus layer defects are also observed in Muc2−/− 
and IEC-specific C1galt1−/− mice, which develop spontaneous colitis. 
Interestingly, some patients with ulcerative colitis show defective 
intestinal O-glycosylation resembling that seen in C1galt1−/− mice. 
Paneth cells are located in the crypts of the small intestine. In addition 
to the role of these cells in crypt homeostasis and maintenance of the 
intestinal stem-cell niche, they also secrete antimicrobial effectors 
that prevent microbial invasion and control the composition of the 
gut microflora. These effectors include lysozyme, RegIIIγ, secreted 
phospholipase A2 (which degrades bacterial membrane phospholipids) 
and defensins HD5 and HD6 (pore-forming hydrophobic peptides that 
can integrate into bacterial membranes, resulting in lysis). Production 
of AMPs is regulated by Toll-like receptor (TLR) and NOD2 signals 
triggered by commensal flora. Paneth cell defects and susceptibility 
to intestinal inflammation have been uncovered in mice deficient in 
several Crohn’s-disease-associated genes, including Nod2, Atg16l1 
and Xbp1 (refs 27-29). These results highlight pathways important to 
Paneth cell biology, such as the regulation of AMP production (Nod2), 
granule exocytosis (Atg16l1) and the ER stress response (Xbp1). Similar 
phenotypes have been observed in human disease, such that patients 
with Crohn’s disease carrying the ATG16L1 (T300A) mutation show 
Paneth cell granule abnormalities. These findings suggest that defects 
in Paneth cell biology may define a subset of patients with Crohn’s 
disease. 

Cells with high synthetic capacity and secretory activity, such as 
Paneth cells and goblet cells, have high baseline levels of ER stress, 
leading to activation of the unfolded protein response (UPR), which 
controls cellular programs that allow proper protein processing. 
The UPR is mainly cytoprotective, although it can signal apoptosis 
after sustained ER stress. Increased intestine epithelial ER stress and 
susceptibility to colitis have been observed in mice with overactivation 
of, or perturbations in, the UPR pathway, including Muc2 missense 
mutation, Agr2−/−, Ern2−/− (also known as Ire1β−/−), IEC-specific Xbp1−/−
and Mbtps1-hypomorphic mice. Similarly, studies in primary
IECs from patients with IBD show activated ER stress responses, 
and hypomorphic variants of XBP1 have been associated with risk
of IBD. Overall, these results indicate that genetic variants that
perturb mechanisms that protect against ER stress can affect intestinal 
homeostasis in IBD. In addition to its effects on cell viability, ER stress 
also activates autophagy and IL-23 release, suggesting that sustained 
ER stress may engage inflammatory circuits that are subsequently 
propagated by T cells.

In addition to limiting bacterial translocation across the mucosal 
barrier, IECs promote intestinal homeostasis by regulating innate 
and adaptive immune responses. Illustrating this point, IECs produce 
intestinal alkaline phosphatase, which can mediate lipopolysaccharide 
detoxification. Resolvin-E1, which is generated in part through 
the action of epithelial cyclooxygenase-2, attenuates neutrophil 
transmigration and upregulates epithelial expression of intestinal 
alkaline phosphatase during the restitutive response, a process termed 
epithelial imprinting. IECs can also modulate adaptive immune
responses, driving the differentiation of anti-inflammatory T regulatory
(Treg) cells by releasing the vitamin A metabolite retinoic acid and the
cytokines thymic stromal lymphopoietin (TSLP) and transforming
growth factor-β (TGF-β). Breakdown in such epithelial defence
mechanisms could lead to pathological intestinal inflammation. 


Checkpoints in the innate immune response 

The physical barrier of the intestinal epithelium is complemented by 
a well-evolved mucosal innate immune system, which is populated 
by cells poised to defend against pathogenic incursions and curtail 
inflammatory responses to maintain a state of hyporesponsiveness to 
commensal bacteria. Dendritic cells, macrophages, innate lymphoid 
cells (ILCs) and neutrophils are crucial cellular components of the innate 
immune system during infection or inflammation. Supporting the 
notion that defective innate immune responses can lead to IBD, patients 
with innate immunodeficiencies such as chronic granulomatous disease 
and Hermansky-Pudlak syndrome, which is associated with defective 
responses to bacterial DNA motifs (CpG oligonucleotides) specifically 
in plasmacytoid dendritic cells, tend to develop IBD. Similarly, patients
with Crohn's disease have defective innate immune responses, including 
attenuated macrophage activity in vitro, and impaired neutrophil 
recruitment and exogenous Escherichia coli clearance in vivo.

Intestinal dendritic cells constitute a central interface for monitoring 
the environment and relaying signals to initiate appropriate adaptive 
immune responses. Dendritic cell subsets are specialized and
respond to endogenous and exogenous stimuli such as microbial 
motifs, fatty acids, oxidized lipids and vitamin D by selectively 
engaging pro-inflammatory, anti-inflammatory, epithelial restitutive 
or T-cell education programs, as well as inducing IgA production.
For example, Treg-cell differentiation can be promoted by tolerogenic
dendritic cells induced by TSLP, TGF-β and retinoic acid, all of which
are made by IECs and stromal cells; these dendritic cells express the
integrin CD103 but not the chemokine receptor CX3CR1 (ref. 33). By
contrast, dendritic cells expressing E-cadherin are a pro-inflammatory
subset that promotes TH17-cell differentiation (see ref. 37 and page 298
for further details). Bacterial flagellins can override dendritic cell 
tolerogenic programs by stimulating TLR5 and inducing the release 
of pro-inflammatory mediators from hyporesponsive lamina propria 
CD11chigh dendritic cells, pointing to a broader role for flagellated
bacteria in IBD. This specific immunostimulatory role for TLR5 may
be particularly relevant in IBD, as seroreactivity to the bacterial flagellin 
CBir1, observed in approximately 50% of patients with Crohn's disease, 
correlates with a complicated clinical course. 

Intestinal homeostasis is maintained in part by the actions of resident 
macrophages that have enhanced phagocytic and bactericidal activity 
and decreased production of pro-inflammatory cytokines. Specialized 
macrophage subsets are also involved; tumour-necrosis factor-α
(TNF-α)-secreting and IL-1-secreting Ly6Chigh monocytes are
recruited in the initial phase of microbial challenge or tissue
injury, whereas reparative IL-10-secreting, TGF-β-secreting and
vascular-endothelial-growth-factor-secreting Ly6Clow monocytes are
mobilized during the resolution phase of inflammation. Neutrophils
may also contribute to the resolution of inflammation, for example, by
synthesizing anti-inflammatory mediators such as lipoxin A4. Studies
showing impaired secretion of lipoxin A4 in mucosal tissues from
patients with ulcerative colitis support the relevance of such mechanisms 
in IBD. 

IL-22 is emerging as an important cytokine in epithelial homeostasis, 
showing protective activity in different models of colitis through its 
stimulatory effect on antimicrobial and reparative processes. Produced 
by several cell types, such as ILCs, lymphoid tissue inducer (LTi)
cells, TH17 cells and γδ T cells, most intestinal IL-22 at steady state is
produced by ILCs expressing the transcription factor RORγt. Studies
in patients with Crohn’s disease have shown decreased frequencies of
IL-22-secreting ILCs in the lamina propria. Together, these findings
suggest a central role for ILCs (and other IL-22-producing cells) in 
regulating intestinal homeostasis, which remains to be characterized 
in IBD. 


NOD2 and IBD 

NOD2 was the first gene to be associated with IBD, and thereafter
several genes that interact epistatically with NOD2 signalling were also
implicated. NOD2 recognizes the peptidoglycan product muramyl
dipeptide (MDP), which modulates both innate and adaptive immune
responses (Fig. 4). For example, MDP stimulation induces autophagy,
which controls bacterial replication and antigen presentation, and
acts on dendritic cells in conjunction with TLR ligands to promote
TH17-cell differentiation. NOD2 may also contribute to immune
tolerance. These effects are impaired in cells from patients with the
Crohn’s-disease-associated NOD2 mutation 3020insC. Furthermore,
NOD2 can participate in distinct MDP-independent pathways such
as regulation of the T-cell response and the type I IFN response to
single-stranded RNA (ssRNA) stimulation, indicating that gut
microbial ssRNAs may exist and have immunomodulatory properties.
The relative contributions of these cytosolic MDP-sensing pathways
vary greatly between cell types (Fig. 4). Further studies are needed to
uncover the effect of disease-associated NOD2 alleles in different
cell-specific programs, and unravel the precise role(s) of NOD2 in IBD.

Other families of innate immune receptors linked to intestinal
inflammation and immunity include NOD-like receptors (NLRs) and
RIG-I-like receptors (RLRs). These receptors recognize microbial motifs
or damage-associated molecular patterns and can activate the
inflammasome; thus, appropriate regulation of these pathways is required
for intestinal homeostasis. For example, mouse knockout studies of
Nlrp3 or RIG-I (also known as Ddx58) show increased susceptibility
to experimental colitis. Conversely, sustained overactivation of NLRs
can also have detrimental effects, as illustrated by activating mutations
in NOD2 and NLRP3 giving rise to Blau syndrome and
cryopyrinopathies, respectively.


CARD9 and IBD 

CARD9 is an IBD-implicated adaptor protein that integrates signals
from many innate immune receptors that recognize viral, bacterial and
fungal motifs. Depending on the stimulus, CARD9 interacts with
distinct signalling complexes and activates different pathways to modulate
cytokine environments appropriately. In particular, recognition of
fungal motifs in human dendritic cells leading to CARD9 and dectin-1
signalling results in the broad activation of members of the nuclear
factor-κB (NF-κB) transcription factor family, whereas CARD9 and
dectin-2 signalling selectively activates the IBD-implicated NF-κB
factor REL, enhancing the production of TH17-polarizing cytokines such as
IL-1 and the IL-23 p19 subunit. Defective CARD9 function leads to
the immune disorder mucocutaneous candidiasis, at least in part owing
to failure to promote an adequate TH17 immune response. These data
illustrate how innate immune signalling molecules, including NOD2
and CARD9, can act as central hubs to integrate diverse signals and
selectively activate specific effector pathways; in the polymicrobial
context of the gut, it seems reasonable that defects at such nodal points
would constitute key predispositions to IBD.

Figure 4 | Cell-intrinsic functions of NOD2. NOD2 is activated by the
bacterial peptidoglycan muramyl dipeptide (MDP). Cell-specific NOD2
functions are shown, distinguishing between those functions impaired
in cells from humans with the Crohn’s-disease-associated mutation
3020insC (red), from Nod2-deficient mice (blue), or from both (black). a,
In Paneth cells, Nod2 deficiency leads to attenuated antibacterial activity
in the intestinal crypts and decreased expression of α-defensin 4 (encoded
by Defcr4, also known as Defa4) and α-defensin-related sequence 10
(DEFCR-RS10, also known as DEFA-RS10). b, MDP-stimulated release of
pro-inflammatory NF-κB-dependent cytokines (such as IL-1β, TNF-α and
IL-6), as well as secretion of IL-23 (which promotes TH17 differentiation)
after co-stimulation with MDP and TLR2 ligands, is decreased in
antigen-presenting cells from Nod2-deficient mice or 3020insC human donors.
MDP stimulation also leads to NOD2-activated autophagy and antigen
presentation. In mice, the activation of antigen-presenting cells by ssRNA
or respiratory syncytial virus (RSV) stimulates secretion of type I interferon
(IFN-β) in a NOD2-dependent, receptor-interacting protein-2
(RIP2)-independent fashion. In contrast to the pro-inflammatory effects, chronic
NOD2 activation (right) by MDP induces both self-tolerance and
cross-tolerance to IL-1β, and TLR2 and TLR4 ligands. This is dependent on
IRF4 in mice and humans, and also on IL-10, TGF-β, IL-1RA and
IL-1R-associated kinase M (IRAK-M) in humans. MDP-induced tolerance is lost
in Nod2-deficient mice and in patients with the 3020insC variant.
NOD2-dependent release of IL-10 after MDP stimulation has been demonstrated
to be specific to humans and is impaired in 3020insC cells. MDP-stimulated
release of several cytokines, including IL-10, IL-1β, TNF-α and IL-6, is
dependent on RIP2. c, In mice, NOD2 mediates IFN-γ secretion and
REL-dependent IL-2 production in T cells in response to Toxoplasma gondii
infection. Also, Nod2 deficiency attenuates the ability of T cells to cause
experimental colitis after transfer into Rag1-deficient hosts.


Redox equilibrium in IBD 

The reduction and oxidation (redox) state of the gut depends on an
equilibrium between oxidants, such as free radicals, ROS or reactive
nitrogen species, and antioxidant mechanisms, such as the glutathione
peroxidase (GPX) and glutathione S-transferase enzymes. This redox
state affects many signal-transduction pathways, such as NF-κB
signalling and AMP activity. Supporting the importance of antioxidant
pathways in intestinal homeostasis, mice deficient in both Gpx1 and Gpx2
develop spontaneous colitis. IBD genetic studies have implicated loci
containing GPX1 and GPX4, further highlighting the relevance of these
mechanisms in disease (Fig. 2). Among the oxidants, ROS represent an
important class of effector molecules generated by mitochondrial and
non-mitochondrial sources. ROS are non-toxic at basal levels and are
even required to maintain the intestinal stem-cell niche. In the context
of innate immunity, ROS have important antimicrobial activity, and
contribute to intracellular signalling, promoting the production of
pro-inflammatory cytokines. Furthermore, ROS generated by epithelial cells
after infection can transmit signals to adjacent cells in a paracrine
manner, allowing the local coordination of chemokine production. Genes
within several IBD-associated loci may either regulate ROS production
or protect against oxidative stress (Fig. 2). In particular, NOD2, CARD9
and IFN-γ-regulated leucine-rich repeat kinase 2 (LRRK2) all
contribute to ROS production. In addition to pro-inflammatory pathways,
ROS are also involved in Treg-cell polarization and function. Thus,
understanding the role of disease variants will require a broader
understanding of the cell- and tissue-specific effects of ROS.


Autophagy and IBD 
Genetic analyses have shown an unsuspected role for autophagy in innate
immunity and IBD, implicating two component genes, ATG16L1 and
IRGM, in IBD pathogenesis. Autophagy is involved in intracellular
homeostasis, contributing to the degradation and recycling of cytosolic
contents and organelles, as well as to resistance against infection and
the removal of intracellular microbes (Fig. 5). ATG16L1 is essential for
all forms of autophagy, and the coding mutation T300A is associated
with increased risk of Crohn's disease. Despite ubiquitous expression of
ATG16L1, defects associated with ATG16L1 polymorphisms have so far
been described only within the gut, probably owing to the high microbial
load in this tissue. Subsequent evidence for MDP stimulation of
NOD2-activated autophagy illustrates a link between genetic risk loci, and
highlights the importance of defining disease-associated pathways and the
potential of new roles for known genes. Epithelial cells and dendritic
cells containing Crohn’s-disease-associated ATG16L1 and NOD2
variants show defects in antibacterial autophagy. In dendritic cells, these
defects are associated with an impaired ability to present exogenous
antigens to CD4+ T cells. These results illustrate a close relationship between
NOD2, ATG16L1 and autophagy, affecting intracellular processing and
communication with the adaptive immune system, suggesting that
genetic polymorphisms may affect both pathways concomitantly.

Abnormalities consistent with Crohn's disease have been observed
in mice with defects in autophagy, including hypomorphic Atg16l1
(Atg16l1HM) and IEC-specific Atg5-deficient mice. Paneth cells either
from Atg16l1HM mice or from patients with Crohn’s disease who have
the ATG16L1 (T300A variant) allele show aberrant granule size, number
and location, and reduced AMP secretion; notably, they also show gain
of function, as evidenced by upregulated peroxisome
proliferator-activated receptor signalling. The landmark findings that gnotobiotic
(germ-free) Atg16l1HM mice lost these Paneth cell anomalies and their
sensitivity to dextran sulphate sodium (DSS)-induced colitis, and that
these abnormalities were restored by norovirus infection provide a
definitive demonstration of how host-microbial interactions contribute
to the pathophysiology of IBD.

Effectors and regulators of adaptive immunity

Homeostasis in the gut involves a balance between anti-inflammatory 
and pro-inflammatory signals, such that inflammatory disease results 
from an inadequate Treg-cell response in the face of an overly exuberant
response largely involving TH1 and TH17 cells in Crohn's disease and
TH2 cells in ulcerative colitis. Intestinal inflammation resulting from a
failure to maintain this balance is exemplified by patients with immune 
dysregulation, polyendocrinopathy, enteropathy, X-linked (IPEX) 
syndrome or with WAS (also known as WASP) deficiency, who have 
deficient Treg-cell function. Furthermore, TH-cell polarization in IBD
is unlikely to be a simple divergence between a few disparate T-cell 
fates, but rather to include diverse sub-programs that can be selectively 
activated by antigen-presenting cells, cytokine milieu, microbial factors 
and metabolic programs. This notion is supported by the findings that 
T cells expressing both IFN-γ and IL-17 are detected during all stages of
Crohn's disease, and it refines our understanding of pro-inflammatory
and anti-inflammatory cell types and pathways. 

Recent studies indicate that Treg cells and TH17 cells may arise from a
common precursor, consistent with the observation that TGF-β helps
to direct differentiation of both subsets. The generation of two subsets
with opposing activities from a common precursor is reminiscent of 
differential responses of precursor cells along morphogen gradients 
during development, suggesting that similar morphogen gradients 
may work in the gut in vivo. In this regard, TGF-β alone drives Treg-cell
differentiation, with retinoic acid exerting a synergistic effect;
the differentiation and function of mucosal Treg cells depend on the
transcription factors BLIMP1 and IRF4 (ref. 63). Given the abundance
of TGF-β in intestinal tissues, this may contribute to baseline
homeostasis, for example, by promoting Treg-cell differentiation in
naive lamina propria CD4+ T cells. However, in conjunction with other
signals, including cytokines, metabolites and microbial signals,
TH17-cell differentiation is promoted instead. Experiments demonstrating the
crucial role of IL-23, IL-6 and IL-17 in the development of experimental
colitis support a role for TH17 cells in disease propagation. Illustrating
some of the intercellular interactions that affect the TH17–Treg-cell axis,
γδ T cells can drive the TH17 program and contribute to experimental
colitis, and are in turn suppressed by Treg cells. Furthermore, Treg cells
can support the development of TH17 cells by maintaining decreased
levels of IL-2 in the local milieu.

Transcriptional programs helmed by the Treg- and TH17-cell-lineage-
defining transcription factors FOXP3 and RORγt work together
with a network of transcription factors, which can in turn respond 
to lineage-inducing cytokines and microbial factors. Transcription 
factors can mediate dichotomous functions depending on the cellular 
(and probably cytokine) context; for example, STAT3 drives TH17-cell
differentiation, but is also required for anti-inflammatory IL-10
signalling through distinct pathways, inducing repressors such as 
strawberry notch homologue 2 (SBNO2). The aryl hydrocarbon 
receptor (AHR) is a nuclear receptor that is essential for IL-22 
production and also enhances IL-17 production, albeit to a lesser extent 
than RORγt-driven pathways, illustrating how distinct transcription
factors may drive separate functions in the same cell. AHR responds
to polycyclic hydrocarbons, suggesting that xenobiotic stimuli may 
modulate IL-22 and IL-17 production. Many of the genes required for 
Treg- and TH17-cell differentiation have been implicated in IBD (Fig. 3).
CCR6, which lies in a locus associated with Crohn’s disease, encodes a 
chemokine receptor that is important for lymphocyte homing to the gut 
as well as for the development of intestinal lymphoid follicles — areas 
important for the production of T-cell-independent IgA, which affects 
the microbiota composition of the host. Thus, the gut may use TGF-β
pathways to poise T cells to carry out both pro- and anti-inflammatory 
programs depending on the local presence of cytokines and microbial 
products. 

Illustrating the concept that hyper- or hypo-activation can affect the 
outcome of T-cell differentiation, defects in ITCH, a HECT-type E3 
ubiquitin ligase involved in T-cell activation, lead to impaired Treg-cell
polarization in mice and to autoimmunity in patients. IBD-associated
loci also contain other members of the ITCH pathway, including 
NDFIP1 and TNFAIP3. Defects in these proteins are associated with 
inappropriate T-cell activation, skewed T-cell polarization and 
pathological intestinal inflammation, consistent with the hypothesis 
that NDFIP1 and TNFAIP3 are disease-contributory genes.

The soluble mediators secreted by Treg cells are also released by other,
FOXP3− regulatory T-cell subsets in the gut, such as T regulatory 1 (TR1)
cells. IL-10 release can be induced by IL-27 in several subsets, including
both pro-inflammatory TH1 and anti-inflammatory TR1 subsets. IL-27
is made by antigen-presenting cells, illustrating another homeostatic
interaction between innate and adaptive immune cells. Interestingly,
GWAS implicate both IL10 and IL27 in IBD, suggesting that this may 
represent a central axis of immune regulation in the context of IBD. 

In addition to the contribution of T-cell subsets, experimental 
evidence suggests that B-cell defects may contribute to the development 
of colitis in several ways, including impaired IgA production and 
antigen presentation, effects on early B-cell selection, and the perturbed 
production of pro- and anti-inflammatory mediators. Supporting the 
importance of IgA production in IBD, recent GWAS of selective IgA 
deficiency showed genes also implicated in IBD, namely ORMDL3, REL 
and PTPN22 (ref. 72). 

Immunoglobulins can also have immune modulatory activity,
which is highlighted by IBD genetic studies. Peripheral B-cell
tolerance can be maintained by signalling through the lectin CD22, which
binds immunoglobulin bearing α2,6-linked sialic acid. This epitope
is generated in part through the action of sialic acid acetylesterase
(SIAE). Sequencing studies in small cohorts detected rare SIAE
variants that disrupted enzyme function and secretion in patients with
IBD and other autoimmune diseases. In addition, sialylated IgG may
signal through DC-SIGN (dendritic-cell-specific ICAM3-grabbing
non-integrin) on myeloid cells, leading to increased expression of
inhibitory FcγRIIB (whose gene lies in an
ulcerative-colitis-implicated locus) on macrophages. These data demonstrate
pathways by which immunoglobulins can exert anti-inflammatory
activities and highlight components that genetic studies suggest may
be perturbed in IBD.

The functional relevance of anti-inflammatory B regulatory (Breg)
cells has been demonstrated in several mouse models of inflammatory
diseases, including colitis; Breg cells from patients with SLE also show
impaired function. Breg cells differentiate after stimulation in the
context of either anti-CD40 antibody or TLR ligands, and secrete
anti-inflammatory cytokines such as TGF-β and IL-10. Defects in Breg-cell
development or function might result in failure to upregulate IL-10,
leading to attenuated suppression of CD4+ T-cell production of IFN-γ
and TNF-α, consistent with a broader role of Breg cells in autoimmunity
and inflammatory disease. 

Other cell types that help to regulate gut immunity include
intraepithelial lymphocytes, which comprise many subsets such as CD8αα+
γδ T cells (which show both cytoprotective and cytolytic activities) and
CD8αα+ αβ T cells (which are thought to be regulatory cells that require
TGF-β for development). Enrichment analysis of expression profiles
further suggests that a subset of IBD-implicated genes is expressed 
in natural killer T (NKT) cells, which can detect infection through
microbial lipids presented by CD1d or through atypical endogenous
lipids, such as isoglobotriaosylceramide, which accumulate after
microbial-TLR signalling. Furthermore, NKT cells from patients with
ulcerative colitis produce IL-13 and show enhanced cytotoxicity, further 
indicating that a perturbed NKT-cell response to as yet unidentified 
bacterial ligands may be pro-colitogenic. 

Genetic studies may offer insight into how this balance is disturbed, 
leading to pathological inflammation. Interestingly, perturbation by 
blocking cytotoxic T-lymphocyte antigen 4, an important inhibitory 
molecule expressed on activated T cells, commonly resulted in colitis 
in patients, again suggesting the poised state of activation of T cells in 
the gut.


Mucosal ecology and immune responses in disease

The gut microflora is a community that has co-evolved with the 
host and confers beneficial effects, including helping to metabolize 
nutrients, modulate immune responses and defend against pathogens. 
However, dysregulation of normal co-evolved homeostatic relationships 
between gut bacteria and host immune responses can lead to intestinal 
inflammation. Indeed, accumulating evidence suggests that luminal 
flora is a requisite, perhaps even a central factor in the development 
of IBD. 

Efforts to correlate changes in enteric microbial communities with 
disease are complicated by the great interindividual variation, such that 
even monozygotic twins may share only 40% of faecal phylotypes.
However, clustering the abundance of genes in certain categories in 
a species-independent fashion shows high interindividual similarity, 
suggesting that the microbiome can be perceived as a conserved 
functional entity. Differences in the abundance of both bacterial
species and functional gene categories (such as bacterial motility, sugar 
and iron metabolism) can differentiate patients with IBD from healthy 
individuals, demonstrating IBD-related changes in gut microbial 
ecology (C. Huttenhower, personal communication).

Even within an individual, intestinal microbial communities 
are dynamic and influenced by host factors, dietary effects and the 
microbes themselves. Many of the specific examples have emerged 
from studies in mice. Illustrating how microbial communities can be 
affected by host responses, infection-induced inflammation results in an 
oxidative metabolic shift in Salmonella enterica serovar Typhimurium 
(S. Typhimurium), which the bacterium uses in conjunction with 
host-derived ROS to create a growth advantage over fermenting 
bacteria. Microbe-host relationships are tightly interrelated, such
that host factors can induce functional changes in the microflora that, 
in return, affect host biology. TLRs recognize microbial motifs and have 
a crucial role in determining mucosal susceptibility to injury and repair 
responses. Impaired TLR signalling due to Myd88 deficiency in non- 
obese diabetic (NOD) mice induces changes in microbiota community 
structure that protect against diabetes. Conversely, impaired innate 
immune function in T-bet−/− Rag2−/− (also known as Tbx21−/− Rag2−/−)
mice and Tlr5−/− mice leads to the generation of a pathogenic microbiota
that causes colitis and metabolic syndrome, respectively, even in
genetically normal hosts. Similarly, the gut microbiota induces a
dynamic IgA response, with qualitative and/or quantitative defects 
in IgA production resulting in impaired control of the microbial 
communities. Deficiencies in activation-induced cytidine deaminase
— an enzyme essential for somatic hypermutation and class-switching 
recombination during B-cell maturation — result in IgA deficiency, 
specific expansion of the anaerobic flora and segmented filamentous 
bacteria (SFB), and overstimulation of the mucosal immune system, 
with hyperplasia of mucosal lymphoid structures such as Peyer's patches 
and lymphoid follicles. Similar observations in mice with selectively 
impaired somatic hypermutation point to the importance of affinity 
maturation in generating diversity in the IgA repertoire to control the 
intestinal microbial burden.

Mice fostered on milk lacking sialyl(α2,3)lactose develop a distinct
microbiota that confers transmissible resistance to DSS colitis,
providing an example of dietary effects on the gut microbiota. Dietary
glycans can also be incorporated onto host cell membranes and can 
act as receptors for bacterial toxins. These findings demonstrate that 
host factors, both transient and genetic, can act together with dietary 
factors to modulate microbiota community structure and/or function, 
sometimes indelibly, in IBD-relevant ways. 

The intestinal mucosa can monitor microbial ligands using pattern
recognition receptors (PRRs), and microbial metabolites using
G-protein-coupled receptors (GPCRs) and solute carriers. Short-chain
fatty acids (SCFAs), generated by some microflora constituents and
decreased in ulcerative colitis, can signal through the receptor GPR43 in
neutrophils, with notable proresolving effects on inflammation. Other
examples of GPCRs with immune modulatory activity include GPR120,
GPR65 (also known as TDAG8) and GPR35, which can be activated by
ω-3 fatty acids, extracellular protons and kynurenic acid, respectively,
with anti-inflammatory effects. Of interest, kynurenic acid and other
anti-inflammatory kynurenines are generated by the catabolism of
tryptophan. Host levels of tryptophan are affected by the microbiota,
suggesting how microbes can modulate the host immune response
by metabolic effects. Other microbial metabolites can be transported
into host cells. For example, the solute carrier SLC22A5 transports a
quorum-sensing molecule from Bacillus subtilis, conferring resistance
to oxidative stress in vitro. The proton-coupled histidine/peptide
cotransporter SLC15A4 is required for TLR7 and TLR9 signalling
in plasmacytoid dendritic cells. Furthermore, studies in Slc15a4−/−
dendritic cells suggest that SLC15A4 contributes to TLR9 signalling
by regulating endosomal histidine levels, and to NOD1 signalling by
cytosolic delivery of NOD1 ligands such as the tripeptide motif
L-Ala-γ-D-Glu-meso-diaminopimelic acid (Tri-DAP). Slc15a4 deficiency
ameliorates susceptibility to DSS colitis in mice. These findings and
the presence of genes such as SLC22A5, GPR35 and GPR65 in
IBD-risk loci suggest that PRRs, GPCRs and solute carriers help to maintain
microbe-host relationships and intestinal homeostasis by transducing
signals from microbial ligands and metabolites, which can in turn have
immune modulatory effects on the host.

Figure 5 | Intracellular defence programs in microbial recognition. Host
cells have evolved processes by which they restrict the availability of
intracellular permissive niches to microbes. Microbial recognition by PRRs,
such as NOD proteins and TLRs, activates key immediate host programs,
leading to polarized secretion of pro-inflammatory mediators (directed to
either the luminal or basolateral surface). Bacteria can either be maintained
in subcellular compartments such as microbe-containing vacuoles, or
escape into the cytoplasm, where they can be ubiquitylated and targeted for
degradation. Both subsets can be targeted by the autophagy pathway, which is
also regulated by other host defence mechanisms such as oxidative stress and
inflammasome activation.

Microbial signals also shape innate and adaptive immune responses. 
Germ-free animals have underdeveloped Peyer’s patches, as well as 
fewer IgA-producing plasma cells and lamina propria CD4+ cells,
illustrating the role of the microbiota in generating a mature mucosal 
adaptive immune response. The constituents of the microbiota can 
have important protective roles; for example, the impaired epithelial 
injury response in Myd88-deficient mice highlights the role of microbial
stimulation in epithelial restitution. Similarly, in mice, commensal
bacteria activate expression of the transcription factor NFIL3, inhibit
Il12b expression and protect against colitis, and CD14+ lamina propria
cells from patients with IBD express less NFIL3 than those from healthy
controls.
The microbiota also acts on the epithelium together with adaptive 
immune signals, inducing epithelial secretion of IL-25, which represses 
ILC secretion of IL-22, and thus IL-22-induced AMPs. This equilibrium 
in the healthy mucosa is abrogated by epithelial insults, leading to 
increased IL-22 and activation of antimicrobial programs”. 

Microbial populations and ligands can have pro-inflammatory or 
anti-inflammatory effects. In mice, SFB promote TH17 differentiation 
and IgA production, whereas Clostridium clusters IV and XIVa and 
parasite-secreted proteins such as Heligmosomoides polygyrus 
excretory-secretory antigen promote Treg-cell differentiation. Interestingly, 
patients with IBD show reduced representation of Clostridium 
clusters IV and XIVa, indicating one way in which anti-inflammatory 
Treg-cell effects might be diminished, leading to a predisposition to 
inflammation. The common constituent of normal human microflora 
Bacteroides fragilis produces polysaccharide A, which suppresses IL-17 
production and promotes the activity of IL-10-producing CD4+ T cells 
in mice. The effects of microbial ligands and metabolites on adaptive 
immune function are exemplified by bacterial DNA signalling through 
TLR9 to limit Treg-cell differentiation and promote intestinal immune 
responses to oral infection, and by bacterial ATP promoting TH17 
differentiation. Thus, the microflora shapes development and function 
of the mucosal immune system in a tightly correlated manner. Immune 
stimulatory effects of the microbiota are important to promote an 
effective response against potential pathogens, although dysregulated 
interactions, which might arise from perturbations in host, microbial 
or environmental factors, could lead to a loss of tolerance and promote 
intestinal inflammation. 

Most of the observations detailing the mechanisms of microbe- 
host interactions have been made in mice, and correlations in humans 
remain to be defined. Microbes associated with human IBD include 
Faecalibacterium prausnitzii, adherent-invasive E. coli, invasive 
Fusobacterium nucleatum and mucolytic bacteria such as Ruminococcus 
gnavus and Ruminococcus torques. Reduced levels of F. prausnitzii in 
resected ileal mucosa from patients with Crohn's disease are associated 
with increased risk of endoscopic recurrence; F. prausnitzii stimulates 
IL-10 production in peripheral blood mononuclear cells, which may 
account at least in part for this protective effect. Recent studies suggest 
that adherent-invasive E. coli exploits host defects in phagocytosis and 
autophagy arising from Crohn’s-disease-related polymorphisms to 
promote chronic inflammation in the susceptible host. Patients with 
IBD have a compromised mucus layer and an epithelial surface that is 
densely coated with bacteria; the abundant presence of Ruminococcus 
strains in IBD mucosa raises the possibility that such microbes may 
contribute to the barrier defect observed in IBD, although whether their 
presence is causal or correlative remains unclear. 

These findings show that the composition of the microbiota and its 
interaction with the host are emerging as underappreciated sources of 
gene-environment interactions and are crucial to understanding the 
context of IBD. For example, alterations in the microflora community 
structure, as might occur in the context of antibiotic therapy or 
infectious colitis, can promote the development of IBD or trigger 
disease flares in patients with IBD. Identifying the factors that shape 
microbial community structure and function within an individual 
and that influence its restoration after perturbations will be key to 
understanding IBD pathogenesis. Obtaining such knowledge will 
require identifying associations between microbiome and human 
genetic studies at the very least. 


Future perspectives 

GWAS and next-generation sequencing technologies have provided 
insight into genetic definitions of host susceptibility. GWAS have 
unequivocally identified numerous genomic regions containing 
IBD-risk factors, showing several features of the genetic architecture 
of Crohn’s disease and ulcerative colitis. First, IBD risk involves 
multigenic contributions, each with a relatively modest effect size. 
Second, genetic contributions to ulcerative colitis and Crohn's disease 
overlap, suggesting shared mechanistic features. Third, within 
Crohn's disease and ulcerative colitis, different clusters of risk loci are 
emerging, suggesting that these disease processes may comprise distinct 
pathological subsets beyond Crohn's disease versus ulcerative colitis. 
Accordingly, there is a need to define clinically relevant parameters 
that might help to classify Crohn’s disease and ulcerative colitis further, 
including early-onset disease, stricturing disease, slow progressors, 
frequency of flares and response to therapeutics. Furthermore, given 
the importance of environmental factors in IBD risk, studies aimed 
at defining contributory environmental factors are greatly needed. 
Relevant approaches might include establishing prospective inception 
cohorts or following healthy, high-risk individuals, such as those with 
an affected first-degree relative. 

An important adjunctive approach to GWAS is identifying rare 
variants, which frequently show larger effect sizes. The search for rare 
variants will help to prioritize the probable causal gene(s) within a 
locus (or loci) for experimental validation, identify disease-relevant 
pathways, and possibly identify domains important for protein 
function by leveraging natural mutations as a large forward genetic 
screen. Identifying and validating causal genes and assembling them 
into molecular pathways and cellular networks will require the use 
of patient samples and will considerably empower clinically relevant 
hypotheses. Given the diverse mechanisms that seem to participate in 
IBD, it will also become increasingly important to associate and stratify 
'-omic' measurements of RNA, protein, small molecules, chemical DNA 
modifications and gut microbiota according to patient genotypes. 

There is a clear need to generate quantitative and qualitative 
expression maps of allelic variants. This notion is reinforced by the 
many polymorphisms implicating gene deserts, which probably 
contain regulatory elements. Furthermore, alternative splicing is a major 
contributor to the diversity of our transcriptome and its relevance to 
IBD has already been demonstrated by findings in IL23R. 

The gut has many tiers of defence against incursion by luminal 
microbes, including the epithelial barrier, and the innate and adaptive 
immune responses. These components are all tightly interrelated, and 
disease requires breakdown at several checkpoints. Generating models 
to systematically analyse the defects arising from genetic variants 
associated with IBD is crucial. However, these variants may show the 
disease-relevant defect under select conditions, such as high bacterial 
load found in the colon, and the accompanying cytokine milieu. 

Viral infections are common, and key studies highlight their 
potential to exert important immune modulatory effects. Acute and/ 
or chronic viral infections could interact with host-susceptibility 
factors in a manner that leaves either the cell or the cellular milieu 
poised to promote pathological intestinal inflammation after 
subsequent triggering events. Notably, these studies highlight the 
need to characterize all microbial constituents (viral, fungal, parasitic 
and bacterial) in the context of IBD. Other tools need to be developed 
to study the microbiota at the level of species, geographical location, 
genetic variations, transcriptional dynamics, as well as changes to 
proteins and metabolites. Indeed, bacterial metabolites are principal 
mediators of interactions between microbial species, as well as between 
microbe and host, as exemplified by SCFAs. Studies to identify bioactive 
metabolites and other small molecules may thus have diagnostic and 
therapeutic potential. 

An important goal is to combine these various facets to understand 
how genetic traits are integrated and propagated through physiological 
networks in the context of interactions with other genes, cells, microbes 
and environmental stimuli to control intestinal homeostasis. Genetic 
studies are already used to predict sensitivity to IBD therapies such 
as 6-mercaptopurine and may also be useful in predicting responses 
to biological therapies. Combining the different aspects of IBD 
pathophysiology may allow us to develop a more holistic understanding 
of the disease, thus promoting advances in diagnostics and therapy. 

Microenvironmental regulation of stem cells in intestinal homeostasis and cancer 


Jan Paul Medema & Louis Vermeulen 


The identification of intestinal stem cells as well as their malignant counterparts, colon cancer stem cells, has undergone 
rapid development in recent years. Under physiological conditions, intestinal homeostasis is a carefully balanced and efficient 
interplay between stem cells, their progeny and the microenvironment. These interactions regulate the astonishingly rapid 
renewal of the intestinal epithelial layer, which consequently puts us at serious risk of developing cancer. Here we highlight 
the microenvironment-derived signals that regulate stem-cell fate and epithelial differentiation. As our understanding of 
normal intestinal crypt homeostasis grows, these developments may point towards new insights into the origin of cancer and 
the maintenance and regulation of cancer stem cells. 


The astounding renewal capacity of the intestinal epithelium has 
made the intestine one of the favourite tissues in which to study 
stem-cell regulation. The fact that almost all epithelial cells in the 
intestinal lining are replaced on a weekly basis puts great demands on 
the cellular organization of this tissue, and also puts it at serious risk of 
malignant conversion. Indeed, colorectal cancer (CRC) is one of the most 
common human cancers worldwide, with approximately 1.2 million new 
cases every year. Homeostasis of the intestinal epithelium is maintained 
by an intestinal stem cell (ISC) compartment that resides at the bottom 
of the crypt, safely tucked away from the shear stresses and potentially 
toxic agents that pass through the intestinal tract (Box 1 Figure). These 
ISCs are at the top of a cellular hierarchy and are crucial for the renewal 
of the differentiated progeny within the intestinal layer. 

The intestinal renewal system is tightly controlled and depends on the 
spatial organization of signals that emanate from supportive mesenchymal 
cells, as well as from differentiated epithelial progeny. Intriguingly, 
recent evidence suggests that intestinal cancers may still contain a 
hierarchical organization, with cancer stem cells (CSCs) at the apex. From 
the seminal work of Fearon and Vogelstein it is clear that CRC develops as 
a stepwise accumulation of genetic hits in specific genes and pathways. 
The CSC theory refines this model further and suggests that the actual 
tumorigenic capacity of individual cancer cells may be influenced by 
homeostatic signals derived from their microenvironment. These findings 
are especially exciting in the light of recent developments that have 
increased our comprehension of the regulatory mechanisms that control 
ISCs, and have resulted in new tools to identify and localize ISCs (see 
Box 1). Although we clearly do not fully grasp the complete spectrum 
of signals and interactions at this point, our understanding of normal 
crypt homeostasis and the identification of markers that define ISCs are 
providing intriguing insight into the organization of intestinal cancers. 

In this Review, we discuss the current ideas surrounding the identity 
of the ISC and the microenvironment-derived signals that regulate crypt 
homeostasis. In addition, we discuss the origin of cancer and the role of 
ISCs and CSCs, and present evidence that points to a distinctive role for the 
microenvironment in the onset of cancer and the maintenance of CSCs. 


Stem-cell homeostasis and morphogenetic pathways 

Until relatively recently, ISCs were a rather elusive entity at the 
bottom of the intestinal crypt and could be identified only by indirect 
measurements. The discovery of ISC markers has partly changed this, 
but different markers point to distinct cells within the crypt. In Box 1, 
we detail this debate, as well as the organization of the intestinal crypt 
and villi. In short, the marker LGR5 points to the crypt base columnar 
cells located in between the Paneth cells at the crypt bottom, whereas 
the markers BMI1 and TERT identify the +4 position in the crypt (Box 1 
Figure), just above the Paneth cells. The existence and interdependency 
of these different types of ISC remain a matter of debate (Boxes 1 
and 2). Almost 40 years ago, the unitarian theory proposed that crypts 
are monoclonal populations derived from a single ISC. More recent 
data, however, point to a model in which several ISCs within a crypt 
stochastically drift towards clonality. Whether this involves different 
ISC types is not known (see Box 2 for a discussion on neutral drift and 
monoclonality). Regardless of this dispute about ISC identity, there is a 
consensus that ISCs reside in a niche that provides the cells with essential 
signals, with the morphogenetic pathways (WNT, Notch, bone 
morphogenetic proteins (BMPs) and Hedgehog) having a prominent role. 
Several excellent reviews cover these pathways, so here we mainly 
highlight the spatial regulation instigated by the epithelial cells and their 
microenvironment, while an overview of the morphogenetic signalling 
pathways is shown in Fig. 1. 

A wide range of evidence indicates that the WNT pathway (Fig. 1a) 
has a crucial role in intestinal proliferation and ISC maintenance. For 
instance, inactivating mutations in adenomatous polyposis coli (Apc) or 
activating mutations in β-catenin (Ctnnb1), both key WNT signalling 
factors, drive intestinal hyperplasia. Similarly, overexpression of the 
WNT activator R-spondin-1 induces ISC expansion in vivo. Conversely, 
transgenic expression of the WNT inhibitor DKK1 (ref. 18) or deletion 
of the transcription factor TCF4 (ref. 19) results in a block in WNT 
signals and subsequent attrition of the epithelial layer. Although WNT 
proteins are expressed in a highly complex fashion by both epithelial and 
mesenchymal cells, nuclear localization of β-catenin is only observed at 
the crypt bottom. More recent data show that Paneth cells residing next 
to ISCs are one of the main sources of WNT3a, and they consequently 
spatially constrain ISCs to the crypt bottom. Moreover, deletion of 
Paneth cells decreases the number of ISCs in the crypt. The Paneth 
cell signals thus dictate the size of the stem-cell pool. Whether this is 
a dominant trait of Paneth cells — that is, whether progenitor cells can 
regain ISC characteristics when placed between or next to Paneth cells 
— is not known, but it is clear that an active WNT signal is crucial for 
ISC maintenance. 
Box 1 | Localization and identification of intestinal stem cells 

The intestinal epithelial layer consists of several differentiated cell 
types and is lined with mesenchymal cells (see Box 1 Figure). The 
bottom of the small intestinal crypt contains Paneth cells and ISCs, 
whereas the remainder of the crypt is largely occupied by 
transit-amplifying cells, which are estimated to divide twice a day and are 
key to the rapid renewal of the epithelium. At the top of the crypt, 
proliferation halts, and cells differentiate into either secretory (goblet, 
Paneth and enteroendocrine) cells or enterocytes. Although the colon 
lacks villi, the organization is roughly the same, except that Paneth cells 
are not present and differentiated cells occupy a large part of the crypt. 
The unitarian theory, proposed by Cheng and Leblond in 1974, 
suggests that all different intestinal cell types are derived from a single 
stem cell. Using a sophisticated tracing technique, they defined the 
crypt base columnar cells (CBCCs), named after their appearance 
and location between the Paneth cells, to be the ISC. However, 
label-retention assays performed in the 1960s and optimized extensively 
since make a convincing plea that ISCs are not located between 
Paneth cells and are instead located just above the Paneth cells at the 
+4 position. This latter assay is based on the idea that ISCs retain 
a DNA label because of an immortal DNA strand that they hold onto 
after every division to maximize genome integrity. The identification 
of +4 cells as ISCs is supported by lineage-tracing experiments using 
a mouse strain expressing a Cre recombinase–oestrogen receptor 
fusion protein (CreER) within the Bmi1 locus. This Bmi1-CreER mouse, 
when crossed with a lacZ reporter mouse, specifically activates the 
lacZ gene in the +4 position. More importantly, this irreversible genetic 
mark moves up and down the crypt in the days after Cre-mediated 
recombination to stain all differentiated cell types, suggesting that 
BMI1+ +4 cells behave as ISCs. This is further substantiated by the 
use of a transgenic mouse telomerase reverse transcriptase (Tert)-green 
fluorescent protein (GFP) construct that also marks cells at 
the +4 position, as well as by lineage-tracing using Tert-driven CreER 
mice. Although these data strongly argue for the +4 cell as an ISC, 
they are not entirely consistent, as the TERT-expressing cell has been 
shown to be mainly quiescent and radioresistant, whereas BMI1+ 
and label-retaining cells divide every 24 hours and are radiosensitive. 
More importantly, using similar lineage-tracing techniques, another 
group came to completely different conclusions. On the basis of the 
idea that WNT is a crucial factor in ISC homeostasis, the researchers 
make use of specific WNT targets to mark stem cells in the crypt and 
find Lgr5 to be a reliable marker. Knock-in constructs that allow 
expression of GFP and CreER from the Lgr5 locus show that LGR5 
expression is confined to CBCCs, and that these cells give rise to the 
variety of epithelial cells present in crypts, proving that CBCCs function 
as ISCs as well. Further studies show several other specific markers 
for these cells, such as Olfm4 and Ascl2 (ref. 94). The transcription 
factor ASCL2 is of particular importance because its deletion results in 
the complete loss of LGR5+ ISCs, whereas transgenic Ascl2 expression 
induces crypt hyperplasia. Notably, BMI1 does not seem to have a 
significant role in LGR5+ ISCs as Bmi1-knockout mice show normal 
CBCC morphology, maintain ASCL2 and OLFM4 expression and, 
above all, have a normal intestinal epithelium. It therefore remains to 
be determined whether and how BMI1+ +4 ISCs and LGR5+ ISCs relate 
to each other. Interestingly, recent data indicate that TERT-expressing 
ISCs are quiescent, reside at the +4 location and can generate LGR5+ 
ISCs. Although this suggests that these different ISC types may act 
in a hierarchical fashion (see also Box 2), the observed generation of 
LGR5+ cells from TERT+ cells is not very efficient, and the model clearly 
contrasts with observations that claim that the LGR5+ ISCs show the 
highest telomerase activity in the crypt. 


[Box 1 Figure panels: colon crypt; small intestine crypt-villus. The key includes enterocyte (absorptive cell) and goblet cell.] 


Box 1 Figure | The organization of the colon crypt and the small 
intestinal crypt-villus. Both the colon crypt (left) and the small 
intestinal crypt (right) contain a stem-cell compartment at the crypt 
bottom. CBCCs and the +4 stem cell have been indicated to be present 
between and just above the Paneth cells, respectively. Of note, Paneth cells 
are not detected in the colon, yet a Paneth-like cell has been suggested to 
be present at the crypt bottom. All four lineages (three in the colon) — 
enterocytes, Paneth cells, goblet cells and enteroendocrine cells — appear 
in different, but set ratios. Paneth cells move down to the bottom and are 
long-lived, whereas other lineages move up and are shed (a few days later) 
into the lumen while undergoing apoptosis. Rare cell types reported to 
exist in crypts, such as tuft cells, are not shown. 


Activation of the Notch pathway (Fig. 1b) normally depends on direct 
cell-to-cell contact, but secreted ligands exist. Notch interacts with 
ligands from the delta and jagged family and shows classical feedback 
inhibition. Cells that receive signals through Notch downregulate 
their own Notch ligands and thus deprive their neighbouring cells of 
Notch-activating signals. This mechanism allows easy regulation of 
different cell-fate decisions for neighbouring cells. In the intestine, 
Notch activity determines lineage decisions between enterocyte and 
secretory cell differentiation. Inhibition of the Notch pathway results 
in a massive increase in goblet cells, whereas its activation results in 
goblet-cell depletion. However, Notch seems to have dual functions 
in the crypt, as it acts together with WNT to maintain the proliferative 
speed, and deletion or activation of the pathway significantly 
affects crypt homeostasis. Similar to other developmental systems, 
data suggest that the ISC microenvironment delivers Notch-activating 
signals to maintain 'stemness', which is consistent with the observation 
that Paneth cells express Notch ligands. However, it is important to 
note that it is unclear whether ISCs require Notch to retain stemness or 
whether Notch instead acts on transit-amplifying cells. Nevertheless, 
in general, the data strongly support a model in which Notch directs 
proliferation when WNT signal activity is high, and directs enterocyte 
differentiation when WNT activity levels drop towards the top of the 
crypt. 
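
The feedback inhibition described above, in which a cell receiving Notch signals lowers its own ligand output and so deprives its neighbour of signal, can be illustrated with a toy two-cell model. The sketch below is not taken from the studies cited here: the function name, the Hill-type response curves, the parameter values and the starting levels are all hypothetical choices made only to show how two nearly identical neighbours can settle into different states.

```python
def simulate_lateral_inhibition(t_end=50.0, dt=0.01):
    """Euler-integrate a two-cell Notch/ligand feedback loop.

    Returns (notch_1, ligand_1, notch_2, ligand_2) at time t_end.
    All functions and parameters are illustrative assumptions.
    """
    def activation(ligand):
        # Notch activity in a cell, driven by ligand on its neighbour.
        return ligand**2 / (0.01 + ligand**2)

    def repression(notch):
        # A cell's own ligand output, repressed by its own Notch activity.
        return 1.0 / (1.0 + 100.0 * notch**2)

    # Two nearly identical cells; cell 1 starts with slightly more ligand.
    n1, d1 = 0.35, 0.08
    n2, d2 = 0.35, 0.07

    for _ in range(int(t_end / dt)):
        dn1 = activation(d2) - n1   # cell 1 senses ligand on cell 2
        dn2 = activation(d1) - n2   # cell 2 senses ligand on cell 1
        dd1 = repression(n1) - d1
        dd2 = repression(n2) - d2
        n1, d1 = n1 + dt * dn1, d1 + dt * dd1
        n2, d2 = n2 + dt * dn2, d2 + dt * dd2
    return n1, d1, n2, d2


if __name__ == "__main__":
    n1, d1, n2, d2 = simulate_lateral_inhibition()
    print(f"cell 1: Notch={n1:.2f}, ligand={d1:.2f}")
    print(f"cell 2: Notch={n2:.2f}, ligand={d2:.2f}")
```

Under these assumed parameters the small initial difference is amplified: the cell that starts with more ligand ends with high ligand and low Notch activity, and its neighbour ends in the opposite state. The model says nothing about which intestinal cell type plays which role.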

BMP belongs to a more complex family of ligands comprising BMP 
and transforming growth factor-β (TGF-β) family members, which 
share intracellular signalling through the SMAD proteins (Fig. 1c). 

Box 2 | Crypt monoclonality and stem-cell neutral drift 

Intestinal crypts are monoclonal in origin, which concurs with the idea 
that they are organized in a strict hierarchical fashion. Monoclonality 
is achieved within 2 weeks of birth in a process called purification. 
When a mutation is introduced, crypts go through a mosaic phase, 
but quickly become monoclonal again (reviewed in ref. 93). These 
observations suggest that a mutation in a particular cell can quickly 
spread throughout the crypt, which has implications for cancer. 
Moreover, the data suggest that a single ISC generates the complete 
crypt, giving rise to a monoclonal lineage. However, this model has been 
modified to introduce the concept of neutral drift, a term derived from 
population genetics. In a stem-cell context, this describes how several 
functionally identical ISCs generate a monoclonal crypt by assuming 
that at every division ISCs stochastically generate zero, one or two 
daughter ISCs. When zero daughter ISCs are formed, this specific clone 
is lost and quickly replaced by a neighbouring ISC, hence explaining 
how several ISCs can, by neutral drift alone, generate monoclonal 
crypts. Sophisticated genetic-tracing strategies using either Lgr5-driven 
‘confetti’ mice or Ah-CreER mice further verified this concept, leading to 
the conclusion that several ISCs within a crypt show a non-hierarchical 
organization that is subject to neutral drift. Importantly, the onset 
of cancer through APC mutations greatly influences crypt dynamics, 
because this tumorigenic stem-like cell is suggested to disregard neutral 
drift and possess the capacity to quickly repopulate the crypt with its 
own progeny, generating early aberrant crypt foci (Fig. 2). 


Although the above-described model satisfactorily explains the 
mechanism by which monoclonality is attained, a few puzzling issues 
remain. The first is the observation that crypts are polyclonal at birth, 
before purification. Apparently, the neutral drift is not occurring 
prenatally. The second issue is the purification speed of the adult 
colon, which is faster than in the small intestinal crypts. Yet colon 
ISC proliferation, a crucial determinant for the pace of neutral drift, 
is reportedly slower. In the current, stochastic drift, model this can 
be explained only if the number of stem cells is significantly lower 
in the colon. An alternative explanation for these observations could 
be that a slow cycling or quiescent ISC sits on top of the hierarchical 
organization in adult crypts, similar to the organization of the 
haematopoietic stem-cell compartment. This ISC would be activated 
under extreme stress conditions such as irradiation. In agreement with 
such a model is the observation that TERT-expressing quiescent ISCs 
can give rise to LGR5+ ISCs after radiation damage. Importantly, this 
also sheds new light on crypt monoclonality, which could thus be a 
combination of neutral drift of LGR5+ ISCs and repopulation by rare 
‘master’ ISCs. It may prove insightful if lineage-tracing experiments 
with Lgr5-CreER mice are followed by sublethal irradiation to kill 
the LGR5+ ISCs. ISC and crypt regeneration through fission or by a 
previously quiescent ISC will show distinct dynamics and probably 
provide insight into the relationship between CBCCs, +4 ISCs and 
quiescent ISCs. 
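
The neutral-drift idea in this box, in which a lost ISC is replaced by a neighbour until one clone remains, can be sketched as a small stochastic simulation. This is a minimal illustration under stated assumptions, not the model used in the cited tracing studies: the function name, the ring arrangement, the default of eight stem cells and the random seed are hypothetical. Divisions that yield exactly one daughter ISC leave the clone composition unchanged, so only paired loss-and-replacement events are simulated.

```python
import random


def simulate_neutral_drift(n_stem_cells=8, seed=0, max_steps=100_000):
    """Simulate neutral drift among equipotent ISCs arranged on a ring.

    Returns (steps, clones): the number of loss/replacement events
    simulated and the final clone label of every stem-cell position.
    The geometry and cell number are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Each ISC starts as its own clone, labelled by its ring position.
    clones = list(range(n_stem_cells))
    steps = 0
    while len(set(clones)) > 1 and steps < max_steps:
        lost = rng.randrange(n_stem_cells)        # this ISC leaves zero daughter ISCs
        side = rng.choice((-1, 1))                # pick the left or right neighbour
        neighbour = (lost + side) % n_stem_cells  # neighbour leaves two daughter ISCs
        clones[lost] = clones[neighbour]          # the neighbour's clone fills the gap
        steps += 1
    return steps, clones


if __name__ == "__main__":
    steps, clones = simulate_neutral_drift(n_stem_cells=8, seed=1)
    print(f"monoclonal after {steps} replacement events; surviving clone {clones[0]}")
```

Repeating the run with different seeds gives different surviving clones and different times to monoclonality, which is the sense in which no ISC is privileged under neutral drift. Lowering `n_stem_cells` shortens the typical time to fixation, in line with the remark above that faster purification would require fewer stem cells.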


BMP2 and BMP4 are expressed by mesenchymal cells in the intestine 
and are suggested to halt proliferation at the crypt-villus border, 
allowing differentiation. In agreement, Bmpr1a-deficient mice (which 
lack a BMP receptor) or mice overexpressing the BMP inhibitor 
noggin show hyperproliferation, as well as crypt fission. To coordinate 
the segregation between proliferation and differentiation, BMPs are 
active at the top of the crypt, where differentiation occurs. BMPs are 
also produced at the crypt bottom, but here they are kept in check by 
BMP inhibitors (such as noggin) that are specifically expressed by the 
mesenchyme in the ISC region. Notably, BMPs also have dual functions 
and are involved in lineage fate decisions towards secretory cells. 
Through BMPs and noggin, the mesenchymal microenvironment thus 
secures a spatial organization in the crypt (Fig. 1c). 

The role of the Hedgehog pathway (Fig. 1d), which acts through 
the membrane proteins smoothened and patched, in intestinal 
homeostasis is more confusing. Conditional deletion of patched, a 
negative regulator of the pathway, inhibits WNT in the small intestine 
and leads to premature enterocyte differentiation. Physiologically, 
it seems that Indian hedgehog (IHH), which is mainly expressed by 
differentiated epithelial cells, signals to the mesenchyme, where it 
is thought to induce BMP secretion. In contrast to IHH, the role 
of Sonic hedgehog (SHH) is less clearly defined. SHH is reported 
to be expressed in a single cell in the crypt at the +4 position, which 
demonstrates ISC features, but further studies are needed to define 
this cell and its function better. 

Together, the current data paint a picture in which the micro- 
environment of ISCs, composed of direct progeny, mesenchymal cells 
and probably extracellular matrix components, organizes into a complex 
interaction of morphogenetic signals that each has a crucial role in crypt 
maintenance (Fig. 1 and Box 1). This complexity explains why in vitro 
recapitulation of the crypt architecture has proven extremely difficult 
to achieve. However, the insight into crypt homeostasis has allowed 
two independent groups to devise culture methods that successfully 
generate crypt structures, which can be sustained for long periods of 
time in vitro. Both culture systems require a solid matrix (collagen 
or Matrigel) and are strongly potentiated by R-spondin-1, which 
enhances WNT signalling. In both systems, cells seem to organize into 
crypt-like structures containing ISCs and transit-amplifying cells, as 
well as differentiated cells. Interestingly, a completely different niche 
dependency was observed in the two culture systems. Whereas the 
collagen cultures depend on mesenchymal myofibroblasts, these are 
redundant in Matrigel. It is important to note that the latter cultures 
require noggin, probably preventing BMP-driven differentiation of 
ISCs and transit-amplifying cells. Whether noggin or, alternatively, 
growth factors present in Matrigel replace the myofibroblasts remains 
to be established, but it is clear that these cultures have enormous 
potential for further study. For instance, as both BMI1+ and LGR5+ cells 
are present in the crypt cultures, this should allow in vitro 
lineage-tracing and thereby provide an avenue to address the relationship 
between different ISCs in easily accessible systems. At this point, 
the Matrigel culture system has been used to show that even a single 
LGR5+ ISC can expand and self-organize into a crypt-like structure. 
This process is greatly enhanced in the presence of 
WNT-ligand-producing Paneth cells, validating the idea that Paneth cells support 
ISCs, and simultaneously confirming that ISCs require input from 
a niche. Undoubtedly, these culture methods will in the near future 
provide us with more direct insight into the role of mesenchymal cells 
and the distinct morphogenetic pathways in ISC maintenance and 
differentiation. 


Intestinal cancer 

Despite stringent homeostatic maintenance in the intestine, the 
high number of patients with CRC indicates that these regulatory 
mechanisms often fall short in protecting against malignant 
transformation. Both environmental and genetic risk factors have been 
defined for CRC, and, not surprisingly, deregulation of morphogenetic 
pathways plays a key part in cancer development. Environmental 
factors include a Western diet and a history of inflammatory bowel 
disease, whereas a genetic component has been clearly defined by 
genome-wide association studies. In addition, prominent genetic 



Figure 1 | Crypt homeostasis. a, WNT ligands around the base of the 
intestinal crypt signal through a complex of frizzled (FZ) and low-density 
lipoprotein receptor-related protein (LRP) receptors. Canonical WNT 
regulates β-catenin localization through a destruction complex that contains 
APC, AXIN2, glycogen synthase kinase-3β (GSK3β) and casein kinase 1 
(CK1), and directs the phosphorylation and subsequent degradation of 
β-catenin. In the presence of WNT ligands, this complex disassembles, 
allowing the accumulation and translocation of β-catenin to the nucleus, 
where it drives the transcription of WNT target genes aided by TCF and 
lymphocyte enhancer factor-1 transcription factors. b, Notch cooperates 
with WNT to drive proliferation, and is involved in lineage-fate decisions. 
This straightforward pathway is active in cell-to-cell contact. Delta and 
jagged ligands on the surface of one cell activate Notch receptors on a 
neighbouring cell. This is followed by two proteolytic events catalysed by a 
disintegrin and metalloprotease (ADAM) and γ-secretase (GS) proteases, 
which release Notch intracellular domain (NICD). NICD translocates to 
the nucleus, where it drives the transcription of Notch gene targets aided 


predispositions are found in several familial CRC syndromes, such 
as familial adenomatous polyposis (FAP) and hereditary non-polyposis 
CRC (HNPCC or Lynch syndrome). Patients with 
FAP develop hundreds of colonic polyps early in life, and their 
lifetime risk of developing CRC is almost 100%. The genetic defect 
that underlies this syndrome is a heterozygous mutation in the APC 
gene, a crucial negative regulator of the WNT pathway (Fig. 1a). 
Notably, the vast majority of sporadic CRC cases also carry WNT 
pathway mutations (85% APC and 10% β-catenin), highlighting the 
importance of this pathway in CRC. On the basis of work by Fearon 
and Vogelstein, the prevailing model is that CRC develops owing to 
an accumulation of mutations, each defining a different step in the 
adenoma-carcinoma sequence. In this sequence, the first hit that 
induces the transition from normal to polypoid tissue is seen in the 
WNT pathway, whereas progression to adenomas and carcinomas 
depends on, for example, activating mutations in the RAS pathway 
and inactivation of p53, SMAD4 and PTEN (Fig. 2). Although this 
mutation-driven model is supported by a range of experimental data, 
several refinements are needed to explain the current observations 




by mediators such as recombining binding protein suppressor of hairless 
(R) and mastermind-like protein 1 (M). c, BMP proteins, produced mainly 
by stromal cells, counteract proliferative WNT signals and thereby halt 
proliferation and drive differentiation. At the crypt bottom, BMPs are blocked 
by noggin, which specifically binds BMPs and prevents receptor interaction. 
Signalling by BMPs depends on the heterodimerization of the BMPR1 and 
BMPR2 receptors, leading to phosphorylation of SMAD1, SMAD5 and 
SMAD8. Phosphorylated (P) SMADs associate with SMAD4, translocate 
to the nucleus and drive the transcription of BMP target genes aided by 
RUNX2 and a cofactor (C). d, Hedgehog proteins relay signals between 
epithelial cells and the mesenchyme. Hedgehog has been shown to counteract 
WNT-driven epithelial proliferation, potentially through BMPs. The role 
of SHH in ISC biology remains unclear. Hedgehog signalling relies on the 
interaction between patched (PTCH) and smoothened (SMO). Patched 
represses smoothened, but is blocked when bound to Hedgehog. Derepressed 
smoothened activates GLI transcription factors (act-GLI), which translocate 
directly to the nucleus and drive the transcription of Hedgehog target genes. 


fully. First, tumorigenicity induced by mutations is proposed to be 
different when these are introduced in ISCs compared with transit- 
amplifying or differentiated cells, which has led to the idea that 
ISCs are the cell of origin in cancer. Second, microenvironmental 
influences on CRC need to be placed within this scheme of events, as 
it is now well established that environmental factors are an enabling 
characteristic promoting tumour initiation and growth, and this is 
especially evident in CRC. Third, the existence of cell-to-cell variation 
in the grade of differentiation within CRC as proposed by the CSC 
hypothesis needs to be evaluated and placed within the context of the 
mutation-driven model. 


Stem cells and the origin of cancer 

As mentioned earlier, the sequence of events in CRC has been 
intensively studied using a variety of mouse models. The most 
frequently used model is the ApcMin mouse, which was generated 
by a random mutagenesis screen. Similar to patients with FAP, 
this mouse strain contains a heterozygous truncating mutation in 
the Apc gene, and develops dozens of polyps and small adenomas 
Figure 2 | Stem cells and the environment in the adenoma-carcinoma 
sequence. a, Normal organization of the intestinal crypt. b, After the loss of 
wild-type APC or β-catenin mutation, the transformation of healthy crypts 
towards an adenoma starts. This is accompanied by several changes in crypt 
appearance and behaviour. First, cells show a more immature phenotype 
and a higher proliferative index (more red and pink cells with irregular 
shape). Second, crypt fission, a physiological process in which a crypt splits, 
is observed and results in expansion of the pre-malignant clone. Third, 
the attraction of myeloid cells (round light blue), producing factors such 
as interleukin-6 and tumour-necrosis factor-α (light blue dots), is detected 


in the intestine. Although this strain has been available for a long 
time, the cell of origin for cancer formation in this model of CRC 
has remained obscure. Both a bottom-up theory, in which ISCs are 
the cell of origin, and a top-down theory, in which a progenitor or 
differentiated cell is the first transformed cell, have been suggested. 
Evidence for the top-down model of CRC development relies almost 
completely on histopathological observations. By contrast, the 
bottom-up theory recently received strong genetic support, as ISC-specific 
deletion of both functional Apc alleles using Bmi1-, CD133 
(also known as Prom1)- and Lgr5-Cre recombinase mice leads to the 
very rapid development of full adenomas. Notably, in a parallel 
approach, deletion of functional Apc in short-lived progenitor or 
differentiated cells resulted in only sporadic and slow-developing 
adenomas. Similar conclusions on the role of normal stem cells as the 
cell of origin of tumours have been reached using mouse models for 
chronic myeloid leukaemia, prostate carcinoma and glioblastoma 
(see ref. 47 for review). However, several important issues on the role 
of microenvironmental factors, the speed of tumour development 
and the true nature of human tumours need to be considered before 
translating this knowledge to human CRC. 


The adenoma-carcinoma sequence and polyclonality 

Although the models for ISC-driven tumorigenesis provide valuable 
insights, the speed at which adenomas form in these systems is much 
faster than in humans. The speed of tumour formation can be attributed 
to a simultaneous deletion of both Apc alleles, but this rapid generation 
of adenomas is unlikely to be a true mirror image of human disease. For 
example, in large human case studies, very early adenoma precursor 
lesions or aberrant crypt foci (ACF), which mostly occur through APC 
mutations, are much more abundant than adenomas. This indicates 
that the transition from an APC mutant ACF stage to a full adenoma 
is a slow process, or that most ACF never progress to an adenoma. In 
agreement, studies on a unique mosaic patient with FAP show that most 
of this patient’s adenomas are polyclonal in nature, whereas the early 
lesions (monocryptal ACF) are all monoclonal. This suggests that 
the transition from ACF to adenomas may be a much more complex 
process than simple expansion. In this light, it is interesting to note 
that the deletion of Apc in mouse intestinal progenitor cells gives rise 
to micro-adenomas that remain present until later age, in contrast 
to the idea that such clones would be pushed to the top of the villi and 
discarded into the gut lumen. In some cases, those lesions even progress 
to adenomas. Arguably, this sequence of events provides a more 
realistic model for human disease, at least from a timing perspective. 
and promotes carcinogenesis. Although normal myofibroblasts (blue) 
are still present, factors produced by premalignant and infiltrating cells 
activate myofibroblasts (orange). The myofibroblasts produce, among other 
factors, large amounts of hepatocyte growth factor (HGF), which promotes 
dedifferentiation (orange dots). c, The accumulation of other genetic lesions, 
in RAS and PTEN, for example, induces progression towards an invasive 
growing CRC. At this stage, stromal cells become an even more pronounced 
part of the tumour through the production of factors that further promote 
tumour progression (orange dots). Note the continued presence of several 
types of differentiated (tumour) cell throughout the whole sequence. 


APC mutations are carefully selected 

Despite the fact that the deletion of both Apc alleles results in rapid 
adenoma formation, WNT activity levels in human CRC seem to be 
strongly regulated during both initiation and progression. This is 
partly due to the genetic make-up of developing lesions and is partly 
induced by the microenvironment. For instance, loss of the normal 
APC allele in individuals with FAP is a non-random event, because 
the nature of germline APC mutation has been shown to influence the 
type of the second APC hit. In essence, it seems that the shorter the 
germline-truncated APC protein is, the longer the somatic-mutated 
APC protein will be, and vice versa. As the length of truncated APC is 
directly linked to the ability to prime β-catenin for degradation, this 
led to the ‘just right’ signalling model, in which only a specific range of 
WNT signalling activity levels (not too high, not too low) is associated 
with transformation of intestinal epithelial cells. This observation 
suggests a careful balance in WNT activity during CRC initiation, and 
it is therefore not clear whether simultaneous deletion of both mouse 
alleles in ISCs is representative of human disease. 


Adenoma formation is influenced by the microenvironment 
The above-described ISC-driven models suggest an almost cell-autonomous 
induction of adenomas. However, in the ApcMin 
mouse model, regulation by environmental factors is clearly 
observed. As an example, the administration of dextran sulphate 
sodium (DSS), which induces intestinal inflammation, results in a 
strong increase in polyp formation in ApcMin mice, indicating that 
microenvironmental factors have an important role. Whether DSS-induced 
inflammation can also support adenoma formation when 
Apc is deleted in progenitor cells is unclear, but if this were the case 
it would suggest that the cell of origin in CRC depends on the model 
chosen. 

Although this cell-of-origin discussion seems at first sight to be 
an academic issue that will not directly impinge on patients with 
cancer, there is good reason to understand the history of CRCs. For 
instance, it may be crucial to understand whether different types of 
CRC (microsatellite instability, mucinous or neuroendocrine) develop 
owing to different genetic hits or from a distinct cell of origin. More 
importantly, with respect to prevention, knowing the cell of origin and 
understanding the signals that either maintain lesions as small ACF or 
allow them to progress to full adenomas and subsequently carcinomas 
is of vital importance. We can conclude that APC deletion in ISCs leads 
to the very rapid onset of adenomas, indicating that ISCs are more 
prone to full transformation. This is probably due to their capacity 
to self-renew, but caution is needed when translating these findings to 
spontaneous adenoma development, in which the microenvironment 
and other genetic hits are crucial determinants. 


Genetic and microenvironmental influences in CRC 

To gain more insight into the development of CRC and the genetic 
and environmental modulations of this process, several mouse models 
have been generated that extend the ApcMin model of CRC. In most of 
these models, tumorigenesis occurs in the small intestine, whereas in 
humans the disease appears exclusively in the colon. The reason for 
this discrepancy is not well defined, but it is notable that humans rarely 
develop small intestinal malignancies, although homeostatic signals and 
renewal speed are largely identical between colon and small intestine. 
Besides a difference in location, ApcMin mouse adenomas also rarely 
progress to full carcinomas, which may be a timing issue because 
progression in humans is estimated to take approximately 2-10 
years, much longer than the lifespan of a mouse. To model human 
CRC better, mouse strains containing different types of truncated 
Apc gene, representing the variety of human APC mutations, have 
been generated, and these show subtle differences in the number and 
location of adenomas, indicating that the dosage of WNT signalling is 
crucial. Notably, Apc-mutant mice with an extra mutation in the homeobox 
gene Cdx2, a regulator of colon epithelial cell differentiation, shift 
most polyps to the colon. However, progression to invading 
carcinomas in such mice is not observed. Therefore, the next steps 
in the adenoma-carcinoma sequence are modelled by introducing 
further mutations in addition to Apc. For example, Apc-mutant mice that 
also contain an activating Ras mutation show more dysplasia and 
enhanced invasive growth. Similar results have been obtained by 
deleting Pten on an Apc-mutant background. Together, these data support 
the notion that sequential accumulation of genetic mutations underlies 
CRC development, as proposed previously. However, these studies also 
show that the microenvironment is a determining factor in colorectal 
tumorigenesis. The initial search for genetic modifiers of the ApcMin 
phenotype indicated that secreted phospholipase A2, an enzyme 
involved in inflammatory responses, potentiates hyperplasia. 
Similarly, microenvironmental tumour control was deduced from the 
cis-Apc/Smad4 mouse. The tumours that arise in this mouse strain 
carry an Apc mutation in combination with a lack of Smad4, which 
renders the mice unresponsive to differentiation-inducing BMP 
signals from the microenvironment. These mice have highly invasive 
adenocarcinomas, characterized by extensive stromal proliferation. 
More recent findings indicate that this invasive phenotype depends on 
the recruitment of immature myeloid cells to the microenvironment. 
Interestingly, CRC progression in humans is also associated with a loss 
of BMP signalling through inactivation of either SMAD4 or BMPR2 
(ref. 64). The loss of microenvironment-derived BMP-mediated control 
of epithelial proliferation thus seems to promote progression. 

More direct evidence for microenvironmental control of CRC comes 
from mouse models for human juvenile polyposis and Peutz-Jeghers 
syndrome, which are both characterized by hamartomatous polyps. 
Blockade of microenvironmental BMP4 signals by ectopic expression 
of noggin (Fig. 1c), but also Smad4 deletion in mouse T cells, results in 
juvenile-polyposis-like polyp formation. Similarly, a germline Lkb1 
(also known as Stk11) mutation or an Lkb1 mutation specifically in 
mesenchymal cells induces a Peutz-Jeghers-like syndrome in mice, with 
a prominent stromal compartment. This points to a dominant role for 
the microenvironment in this CRC subtype. 

In humans, the strongest support for environmental control of CRC 
comes from the long-standing observation that chronic inflammation 
in patients with Crohn's disease or ulcerative colitis predisposes them 
to cancer initiation in the gut. In CRCs that are not associated with 
inflammation, there is also clear histopathological evidence for the 
function of immune cells in the stroma of malignancies. In addition, the 
administration of non-steroidal anti-inflammatory drugs lowers the risk 
of CRC-specific mortality in humans, and specific cyclooxygenase 2 
Figure 3 | Regulatory signals of colon CSCs provide new therapeutic 
targets. a, Colon CSCs receive a multitude of environmental cues. Signals 
that help to maintain CSCs include Notch and WNT. DLL4 stimulates 
Notch receptors on neighbouring cells and, together with β-catenin (β-cat), 
directs an immature transcription profile that promotes self-renewal. BMP4 
counteracts this self-renewal activity by binding to BMP receptors on CSCs, 
thereby interfering with WNT signalling and subsequently promoting 
differentiation. b, Colon CSCs in vivo are found intimately associated 
with HGF-producing myofibroblasts. HGF maintains colon CSCs in a 
stem-cell state and prevents differentiation. In addition, HGF produced 
by myofibroblasts can upregulate the WNT cascade in more differentiated 
tumour cells, thereby reinstalling CSC features (dedifferentiation). These 
signals that govern CSC behaviour have great therapeutic potential. Several 
possibilities to interfere with the self-renewal capacities of CSCs exist. These 
include inhibitors of the WNT pathway that prevent β-catenin-dependent 
transcription (1); Notch inhibitors, preventing either the ligand from 
interacting with the receptor or the activation of the receptor (γ-secretase 
inhibitors), which are currently under evaluation (2); BMPR agonists that 
activate differentiation programs and could be used to target CSCs (3); and 
inhibitors of the receptor kinase c-Met that could modulate the interaction 
between stromal cells and CSCs to prevent dedifferentiation (4). 

inhibitors cause reductions in intestinal polyps in Apc-mutant mice. This 
effect is at least partially mediated by a decrease in prostaglandin E2 
(PGE2) levels, which leads to a reduction in myofibroblast secretion of 
tumour-supporting factors such as hepatocyte growth factor (HGF) 
and amphiregulin. In addition, lower PGE2 levels decrease the influx 
of inflammatory cell types, such as tumour-associated macrophages and 
mast cells, which form an important tumour-promoting component of 
the cellular stroma. 

In mice, chronic inflammation-associated CRC is often studied 
using the DSS and azoxymethane (AOM) model. Application of the 
carcinogen AOM alone rarely induces intestinal neoplasia in mice but, 
when followed by the colitis inducer DSS, frequent tumours occur in 
the colon. In these mice, the invading myeloid cells produce a plethora 
of growth factors and cytokines, of which tumour-necrosis factor-α, 
interleukin-6 and TGF-β seem to be crucial for tumour initiation and 
progression. These pro-inflammatory factors stimulate the inhibitor 
of κB kinase (IKK)-nuclear factor κB (NF-κB) pathway in epithelial 
cells, which leads to increased proliferation, reduced apoptosis and 
enhanced tumour incidence, as shown by specific IKK-β deletion; this 
confirms a role for microenvironmental factors in the onset of disease. 
Similarly, MyD88 deletion decreases adenoma formation, indicating 
that stimulation of Toll-like receptors by the microflora in the gut 
can directly enhance tumorigenesis. These observations from both 
mouse models and human disease indicate that the (inflammatory) 
microenvironment has a prominent role in the onset and progression 
of CRC by supporting genetic mutations within epithelial cells (Fig. 2). 


Cancer stem cells 

An important downside of modelling cancer in mouse models is the 
fact that the tumours that arise are relatively genetically homogeneous, 
whereas human CRCs are thought to contain as many as 80 mutations, 
which probably vary from cell to cell. This genetic heterogeneity 
within a tumour generates an added dynamic interaction that is hard 
to model in mice. Furthermore, the past decade has seen a shift in the 
way tumours are perceived, and the now widely accepted model is that 
tumours contain a small population of self-renewing CSCs, as well as a 
large compartment of more differentiated tumour cells. It is important 
to realize that this hierarchy is not identical to genetic heterogeneity, 
but instead adds a further layer of complexity. CSCs are defined 
operationally by their capacity to (xeno)transplant a tumour and thereby 
generate a phenocopy of the original human malignancy, including all 
differentiated progeny. In contrast to the more differentiated tumour 
cells, CSCs are thus suggested to embody the driving force within a 
tumour, that is, to contain the capacity to self-renew, expand and 
differentiate. An increasing number of both solid and lymphoid 
malignancies are shown to contain CSCs (see ref. 3 for review). 
However, there is still considerable disagreement on the existence or 
importance of CSCs in human tumours. This is partly due to the lack of 
unique markers for these cells that can be used to directly identify CSCs 
in human malignancies and thereby remove the necessity to perform 
(xeno)transplantation assays. Moreover, our current observations 
indicate that CSC characteristics can also be bestowed on differentiated 
tumour cells when exposed to the right microenvironment (see later). 
The CSC theory may thus be more complex than originally anticipated, 
but it is nonetheless clear that the variation in tumour cell differentiation 
holds important implications for our understanding of tumours. For 
instance, several studies have indicated that CSCs in CRC are more 
resistant to therapy than differentiated tumour cells are, which has 
led to the assumption that CSCs are crucial targets in therapy and, more 
importantly, may be the source of relapsing tumours. In addition, 
we believe that the current data suggest that cellular hierarchy within 
CRC is maintained, at least in part, by microenvironmental factors 
regulating stemness and differentiation, much like normal homeostasis. 
In agreement, mouse adenomas that originate after inactivation of 
APC in LGR5+ ISCs do not simply show an expansion of LGR5+ cells, 
but instead show a distinct subpopulation of tumour cells positive for 
this marker. This observation suggests that a hierarchy is established 
in adenomas and that differentiation towards LGR5-negative cells occurs, 
potentially as a result of normal differentiation cues. This concurs 
with earlier observations that adenoma cells differentiate into goblet 
cells after Notch inhibition with γ-secretase inhibitors, in line with 
the physiological role for Notch in proliferation and differentiation. 
Together, these findings suggest that mouse adenoma cells respond, at 
least partially, to their normal environmental cues, thereby maintaining 
a hierarchy within the developing tumour. 

In human CRC, CSCs are defined using CD133, CD166, CD44 and 
CD24 cell-surface markers, whereas differentiated tumour cells 
express markers normally present in differentiated colon epithelium. 
Importantly, even a single colon CSC can generate a fully differentiated 
tumour after xenotransplantation, proving that CSCs have a 
multilineage differentiation capacity that gives rise to all differentiated 
progeny, but above all they seem to be hardwired (epi)genetically 
such that the offspring reshapes the original human tumour. This is not 
unique to CRCs but widely observed for different solid malignancies, 
and has also been shown at the single-cell level in glioblastoma, 
breast cancer and some haematopoietic malignancies. We believe 
that this is partly regulated by stromal cells that are rapidly attracted 
and form an essential part of CRCs in mouse models, as well as human 
malignancies. A recent study showed that tumour cells located next to 
or within myofibroblast-rich regions have a much higher incidence of 
nuclear-localized β-catenin, directly arguing for microenvironment-modulated 
WNT signalling. In agreement, human colorectal CSCs 
can be defined on the basis of high WNT signalling activity, and 
preferentially localize to myofibroblasts in xenotransplants. Our data 
point to HGF as the myofibroblast-derived signal that, at least in part, 
orchestrates this intimate relationship and enhances WNT activity. 
Conversely, whether myofibroblasts also receive regulatory input 
from epithelial tumour cells, similar to the situation in normal crypt 
homeostasis, remains to be determined. 

These data clearly establish a role for the WNT pathway in CRC 
stemness, but evidence for the involvement of other morphogenetic 
pathways also exists (summarized in Fig. 3). Notch inhibition with an 
antibody against the Notch ligand DLL4 results in human colon CSC 
differentiation, reduction of CRC growth in a xenotransplantation 
model and chemosensitization. Of note, this is at least in part a 
tumour-intrinsic DLL4 signal as it is blocked by a human-specific 
anti-DLL4 antibody. Recent findings also indicate that BMP4 can aid 
in human CSC differentiation, much like it does in normal epithelial 
progenitors, and thereby induce the chemosensitivity of tumours. 
Unlike normal homeostasis, in which BMP4 is expressed by the 
mesenchyme, in tumours BMP4 is expressed by the differentiated tumour cells and, 
as such, imposes a feedback mechanism. Finally, a role for Hedgehog in 
CSCs and tumorigenesis has also been proposed, as progression towards 
metastasis seems to be associated with a switch from WNT to Hedgehog 
signal dependency. However, in mouse models, both activation and 
inhibition of Hedgehog signalling have been shown to prevent tumour 
growth, making the role for Hedgehog less clear-cut. 

From these findings, we propose a model encompassing a hierarchical 
organization of CRCs, which is notably similar to normal homeostasis 
(Fig. 2). In this model, the Notch pathway, together with high WNT 
levels that result from mutation and myofibroblast-derived factors, 
promotes cancer stemness. The BMP pathway counteracts this self-renewal 
mechanism and drives tumour cell differentiation (Fig. 3). 
Whether this concept depends on the nature of the mutations present 
in a specific CRC or is influenced by the stage of disease remains to be 
established. Nevertheless, the current data lend strong support to the 
influence of the microenvironment on CSC maintenance and tumour 
growth. In analogy to the normal ISC niche, we conclude that there is 
a CSC niche that is probably composed of a combination of stromal 
cells and more differentiated progeny, and delivers crucial signals to 
the CSCs. 


Outlook 

Understanding the signals that regulate tumour maintenance obviously 
serves one major purpose — the improvement of cancer therapies (for 
examples, see Fig. 3). Because colon CSCs seem subject to a careful 
interplay of extracellular cues, in the coming years this idea should be 
evaluated in defined tumour models that not only depend on genetically 
modified mice, but also extend findings to genetically heterogeneous 
human tumours containing a hierarchical organization. Only then can 
we accurately assess whether morphogenetic pathways and factors 
derived from stromal myofibroblasts modulate cancer stemness and 
subsequently influence tumour growth (Fig. 3). In addition, such 
models should provide us with more insight into the flexibility of the 
hierarchical organization in tumours. The current model for normal 
intestinal homeostasis is that once epithelial cells have lost their 
stemness, they will terminally differentiate. However, it is clear that 
dedifferentiation can easily occur in tumour settings. That is, under 
the influence of stromal myofibroblasts, more differentiated tumour 
cells can reacquire CSC features, suggesting that the CSC phenotype 
is more fluid than initially proposed. Similar observations have been 
derived for breast cancer cells, which, after induction of the epithelial- 
mesenchymal transition, generate CSCs from more differentiated 
tumour cells. Such flexibility challenges the idea that targeting CSCs 
would be sufficient for efficient tumour control. The dominant role of 
microenvironment-derived HGF in the generation of CSCs in CRC 
indicates that exogenous factors derived from the mesenchyme can 
provide potent tumorigenic signals. As CSCs are more resistant to 
therapy, it is not difficult to appreciate how the induction of CSCs by 
the microenvironment directly influences treatment outcome. Future 
research to devise new therapeutic strategies should therefore focus on 
these microenvironmental interactions, which could prove to be the 
Achilles’ heel of cancer. 

Human nutrition, the gut microbiome 
and the immune system 


Andrew L. Kau, Philip P. Ahern, Nicholas W. Griffin, Andrew L. Goodman & Jeffrey I. Gordon 


Marked changes in socio-economic status, cultural traditions, population growth and agriculture are affecting diets 
worldwide. Understanding how our diet and nutritional status influence the composition and dynamic operations of 
our gut microbial communities, and the innate and adaptive arms of our immune system, represents an area of scientific 
need, opportunity and challenge. The insights gleaned should help to address several pressing global health problems. 


Many recent reviews have described the known interactions 
between the innate and adaptive immune system and the 
tens of trillions of microbes that live in our gastrointestinal 
tracts (known as the gut microbiota). In this Perspective, we emphasize 
how the time is right and the need is great to understand better the 
relationships between diet, nutritional status, the immune system and 
microbial ecology in humans at different stages of life, living in distinct 
cultural and socio-economic settings. 

This is a timely topic for many reasons. There is enormous pressure 
to devise ways to feed healthy foods to a human population whose size 
is predicted to expand to 9 billion by 2050. The solutions will have to 
address the challenges of developing sustainable forms of agriculture in 
the face of constrained land and water resources. There is also a need 
to develop translational medicine pipelines to define more rigorously 
the nutritional value of foods that we consume and that we imagine 
creating in the future. These pipelines are required to evaluate health 
claims made about food ingredients. Increasing evidence shows that 
the nutritional value of food is influenced in part by the structure and 
operations of a consumer's gut microbial community, and that food, in 
turn, shapes the microbiota and its vast collection of microbial genes (the 
gut microbiome) (see, for example, refs 2 and 3). Therefore, to define the 
nutritional value of foods and our nutritional status better, we need to 
know more about our microbial differences and their origins, including 
how our lifestyles influence the assembly of gut microbial communities 
in children, and about the transmission of these communities within 
and across generations of a kinship. We are learning how our gut 
microbial communities and immune systems co-evolve during our 
lifespans, and how components of the microbiota affect the immune 
system. We are also obtaining more information about how our overall 
metabolic phenotypes (metabotypes) reflect myriad functions encoded 
in our human genomes and gut microbiomes. These observations raise 
the question of how the metabolism of foods we consume by the gut 
microbial community affects our immune systems. 

The link between infections that occur within and outside the gut 
and the development of nutritional deficiencies has been emphasized 
for many years. In turn, poor nutrition increases the risk of infection. 
Nonetheless, there is still a dearth of mechanistic information that 
explains these observations. Furthermore, only four years remain to 
achieve the United Nations' eight Millennium Development Goals 
(http://www.undp.org/mdg/). Two of these goals relate to human 
nutrition: one seeks to eradicate extreme poverty and hunger, and 
another aims to reduce the under-five mortality rate by two-thirds. 
Up to 1 billion people suffer from undernutrition of varying degrees, 
including ‘silent’ or asymptomatic malnutrition (http://www.fao.org/ 
publications/sofi/en/), making this condition an enormous global 
health problem. Of the ~10 million children under the age of 5 who 
die every year, undernutrition contributes in some fashion to more 
than 50% of these deaths. Sadly, children who survive periods of severe 
undernutrition can suffer long-term sequelae, including stunting and 
neurodevelopmental deficits. Moreover, the effects of undernutrition 
can be felt across generations. Undernourished mothers suffer higher 
rates of morbidity and mortality, and are more likely to have low-birth- 
weight children, who have an increased risk of developing type 2 diabetes, 
hypertension, dyslipidaemia, cardiovascular pathology and obesity as 
adults. A testable hypothesis is that the gut microbiota may contribute to 
the risk and pathogenesis of undernutrition through effects on nutrient 
metabolism and immune function (Fig. 1). Similarly, the experience of 
undernutrition in childhood could affect the development of metabolic 
capacities by this microbial ‘organ’ in ways that result in persistent 
metabolic dysfunction or inadequate function, thereby contributing to 
the sequelae of malnutrition. Finally, if we define malnutrition as the 
inadequate or excessive consumption of dietary ingredients leading to 
the development of disease, then we also need to consider the alarming 
epidemic of obesity that is sweeping the world and its relationship to the 
gut microbiome and the immune system. 


The marriage of metagenomics and gnotobiotics 

We believe that the ‘marriage’ of two approaches — one involving culture- 
independent (metagenomic) methods for describing the gut microbiota or 
microbiome and the other involving gnotobiotics (the rearing of animals 
under germ-free conditions, with or without subsequent exposure during 
postnatal life or adulthood to a microbial species or species consortium) 
— is a potentially powerful way to address several questions about 
the relationships between diet, nutritional status, the assembly and 
dynamic operations of gut microbial communities, and the nature of 
the interkingdom communications between the gut microbiota and the 
host (including host-microbial co-metabolism, and the co-evolution of 
the immune system). Without dismissing caveats related to the use 
of gnotobiotic models (see later), in this Perspective we describe ways 
that may be useful for joining gnotobiotics and metagenomic methods 
to compare the functional properties of various types of gut microbial 
community, to explicitly test or generate hypotheses, and to develop new 
experimental (and computational) approaches that together inform the 
design, execution and interpretation of human studies. 

Figure 1 | A schematic of the proposed relationships between the 
gut microbiota, the immune system and the diet, which underlie the 
development of malnutrition. Undernutrition is associated with several 
defects in the innate and adaptive immune systems, which, in turn, are 
associated with increased predisposition to diarrhoeal illnesses. Recurrent 
(enteric) infections predispose to macronutrient and micronutrient deficiencies, 
as well as impaired intestinal mucosal barrier function. These factors lead to 
a cycle of further susceptibility to infection and worsening nutritional status. A 
confounding problem is that vaccines designed to protect children from certain 
pathogens (including enteropathogens) show poor efficacy in areas of the world 
where poor nutrition is rampant. One testable hypothesis is that the microbiota 
contributes to disease risk and pathogenesis. Diet shapes gut microbial 
community structure and function, and the microbiota adapts in ways that 
promote nutrient processing; the ability of the microbiota to process a given diet 
affects the nutrient and energetic value of that diet. The microbiota and immune 
systems co-evolve: malnutrition affects the innate and adaptive immune systems 
as well as the microbiota. The microbiota acts as a barrier to enteropathogen 
infection; this barrier function may be disrupted by malnutrition, as well as 
by perturbations in immune system function. The microbiota affects nutrient 
processing by the host, including the expression of host genes involved in 
nutrient transport and metabolism. 


What is changing about what we eat? 

Changes in dietary consumption patterns affect many aspects of human 
biology. To fully understand the determinants of nutritional status, we 
need to know what people are eating and how these diets are changing. 
Unfortunately, accurate information of this type is hard to obtain, and 
when available it generally covers a relatively limited time period. As 
a corollary, searchable databases that effectively integrate information 
obtained from the surveillance efforts of many international and 
national organizations (such as the World Health Organization, the UN 
Food and Agriculture Organization and the United States Department 
of Agriculture (USDA) Economic Research Service) are needed to 
monitor changing patterns of food consumption in different human 
populations. Analysis of USDA data that track the availability of more 
than 200 common food items between 1970 and 2000 shows that diets in 
the United States have changed in terms of both the overall caloric intake 
and the relative amounts of different food items (http://www.ers.usda. 
gov/Data/FoodConsumption). Linear regression of total caloric intake 
over time shows that the average number of kilocalories consumed per 
day increased markedly over this 30-year period (R² = 0.911, P < 10^…). 
This is consistent with estimates from the US National Health and 
Nutrition Examination Survey (NHANES), which indicate that adult 
men and women increased their daily caloric intake by 6.9% and 21.7%, 
respectively, during the same period. If total caloric intake is analogous 
to ‘primary productivity’ in macro-ecosystems, in which primary 
productivity is used as a proxy for available energy, then increasing the 
amount of energy input from the diet would be predicted to affect the 
number of microbial species living in the gut of a single host, as well 
as the magnitude of the compositional differences that exist between 
different hosts or even different regions of a single gut (see ref. 11 for 
discussions about the mechanisms underlying productivity—species 
richness relationships in macro-ecosystems). Intriguingly, metagenomic 
studies of bacterial composition in the faecal microbiota of obese 
and lean twins living in the United States have shown that obesity is 
associated with decreased numbers of bacterial species. Reductions 
in diversity could affect community function, resilience to various 
disturbances and the host immune system. 
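
The regression described above can be illustrated with a minimal sketch. It fits an ordinary least-squares line to total daily caloric intake against year and reports the slope and R². The yearly values below are invented for illustration and are not the USDA figures; the function name `linear_fit` is likewise hypothetical.

```python
# Minimal sketch of a least-squares trend fit; the data are invented,
# not taken from the USDA food-availability tables.

def linear_fit(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)  # squared Pearson correlation
    return slope, intercept, r_squared


years = [1970, 1975, 1980, 1985, 1990, 1995, 2000]
kcal_per_day = [2050, 2080, 2110, 2200, 2270, 2330, 2410]  # hypothetical

slope, intercept, r_squared = linear_fit(years, kcal_per_day)
print(f"slope = {slope:.2f} kcal per day per year, R^2 = {r_squared:.3f}")
```

A slope well above zero together with an R² close to 1 is what a marked, roughly linear increase over the period would look like in such a fit.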

During the past 30 or so years, the North American diet has also 
shifted in terms of the relative contributions of different foods to total 
energy intake. Since 1970, two dietary ‘epochs’ can be distinguished 
based on the contribution of grains to overall calories (the mean 
increase in daily carbohydrate intake for men and women during this 
period was 62.4 g and 67.7 g, respectively). The consumption of other 
food items has also changed: Spearman’s rank correlations between food 
availability and time, followed by adjustments of P values to reflect false 
discovery rates, show that the representation of 177 out of 214 items 
tracked by the USDA has increased or decreased significantly in US 
diets since 1970. For example, Americans now eat less beef and more 
chicken, and corn-derived sweeteners have increased at the expense 
of cane and beet sugars. Furthermore, methods of food modification 
and preparation have changed. Comparable data are needed for other 
countries with distinct cultural traditions, including countries in which 
people are undergoing marked transformations in their socio-economic 
status and lifestyles. 
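
The item-by-item analysis described above can be sketched as follows: compute Spearman's rank correlation between availability and year for each food item, obtain a P value, then adjust the P values to control the false discovery rate. The text does not say which adjustment was used, so the Benjamini-Hochberg procedure here is an assumption, as is the exact permutation test used to obtain P values. The availability figures are invented and all function names are hypothetical.

```python
from itertools import permutations


def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes neither series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def exact_p_value(x, y):
    """Two-sided exact permutation P value for rho (practical for small n only)."""
    rx, ry = ranks(x), ranks(y)
    observed = abs(pearson(rx, ry))
    hits = total = 0
    for perm in permutations(ry):
        total += 1
        if abs(pearson(rx, perm)) >= observed - 1e-12:
            hits += 1
    return hits / total


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P value down
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


years = [1970, 1975, 1980, 1985, 1990, 1995, 2000]
availability = {  # invented per-capita availability, arbitrary units
    "beef": [80, 83, 72, 75, 64, 63, 62],
    "chicken": [27, 26, 32, 36, 42, 48, 54],
    "corn sweeteners": [15, 25, 35, 60, 68, 80, 85],
    "cane and beet sugar": [100, 90, 84, 63, 64, 65, 66],
    "eggs": [39, 35, 37, 33, 38, 32, 36],
}
items = list(availability)
rhos = [spearman(years, availability[name]) for name in items]
p_raw = [exact_p_value(years, availability[name]) for name in items]
p_adj = benjamini_hochberg(p_raw)
for name, rho, p in zip(items, rhos, p_adj):
    print(f"{name}: rho = {rho:+.2f}, adjusted P = {p:.4f}")
```

With only seven time points the exact test enumerates 5,040 permutations per item, which is cheap; for longer series an asymptotic P value or a sampled permutation test would be the more practical choice.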

We know from metagenomic studies of the human gut microbiota 
and microbiome that early postnatal environmental exposures have an 
important role in determining the overall phylogenetic structure of an 
adult human gut microbiota. The assembly of the microbiota towards 
an adult configuration occurs during the first three years of life, and 
features of the organismal and gene content of gut communities are 
shared among family members and transmitted across generations of a 
kinship. We also know that dietary habits influence the structure of the 
human genome. For example, populations that consume diets high in 
starch have a higher copy number of the salivary amylase gene (AMY1) 
than those consuming low-starch diets. We know that these habits also 
affect the gut microbiome. A wonderful illustration of the latter point 
is provided by a microbial β-porphyranase in Japanese populations. 
Zobellia galactanivorans is a marine member of the Bacteroidetes 
that can process porphyran derived from marine red algae belonging 
to the Porphyra genus. Homologues of porphyranase genes from 
Z. galactanivorans are present in the human gut bacterium Bacteroides 
plebeius and are prominently represented in the microbiomes of 
Japanese but not North American citizens. This finding led to the 
suggestion that porphyranase genes from Z. galactanivorans or 
another related bacterium were acquired, perhaps through horizontal 
gene transfer, by a resident member of the microbiota of Japanese 
consumers of non-sterile food, and that this organism and gene were 
subsequently transmitted to others in Japanese society. Together, these 
observations lead to the notion that systematic changes in overall dietary 
consumption patterns across a population might lead to changes in the 
microbiome, with consequences for host nutritional status and immune 
responses. 

We also know, from work in gnotobiotic mice that have received 
human faecal microbial community transplants, that the relative 
abundances of different bacterial species and genes in the gut microbiota 
are highly sensitive to different foods. Gnotobiotic mice containing 
defined collections of sequenced human gut symbionts or transplanted 
human faecal microbial communities could provide an approach for 
modelling the effects of different dietary epochs on the gut microbiota 
and on different facets of host biology. If the desired result is an account 
of the effects of individual food items or nutrients, then feeding the 
animals a series of defined diets, each with a different element removed 
or added, might be an appropriate strategy if the food ingredients for the 
epoch are known and available. If the focus is on the effects of overall 
differences in dietary habits within or between groups of humans, 
then diets should reflect the overall nutritional characteristics of the 
different groups and not merely be representative of a single individual. 
Designing such diets requires detailed accounts of the identity and 
quantity of each food item consumed, ideally for a large number of 
people, as well as the methods used for food preparation. The US diet 
presents a rare opportunity for such an approach, because NHANES 
data sets (http://www.cdc.gov/nchs/tutorials/Dietary/) provide one-day 
dietary recall data at several time points since the early 1970s. 
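
The defined-diet strategy described above (a series of diets, each with a different element removed or added) can be sketched as a simple enumeration. The recipe, the amounts and the function name `single_ingredient_variants` are all hypothetical and serve only to show the shape of such a panel.

```python
def single_ingredient_variants(base_diet, candidate_additions=None):
    """Build a panel of defined diets from one base recipe.

    base_diet maps ingredient name -> amount (here, g per kg of chow).
    The panel holds the complete base diet, one variant per ingredient
    with that ingredient removed, and one variant per candidate addition
    with that ingredient added.
    """
    candidate_additions = candidate_additions or {}
    panel = {"complete": dict(base_diet)}
    for omitted in base_diet:
        panel[f"minus {omitted}"] = {
            name: grams for name, grams in base_diet.items() if name != omitted
        }
    for added, grams in candidate_additions.items():
        variant = dict(base_diet)
        variant[added] = grams
        panel[f"plus {added}"] = variant
    return panel


# Invented recipe, for illustration only.
base = {
    "casein": 200,
    "corn starch": 400,
    "sucrose": 100,
    "cellulose": 50,
    "corn oil": 50,
}
panel = single_ingredient_variants(base, {"porphyran": 20})
for label, recipe in panel.items():
    print(label, recipe)
```

This sketch does not rebalance the remaining ingredients, so the variants differ in total mass and energy content; a real design would need to decide whether to hold calories constant when an ingredient is removed or added.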


Nutrient metabolism and the immune system 

The nexus between nutrient metabolism and the immune system occurs 
at many levels, ranging from endocrine signalling to direct sensing of 
nutrients by immune cells. 

Leptin signalling provides an example of these complex inter- 
relationships. Leptin regulates appetite and is a pleiotropic cytokine, 
maintaining thymic output and cellularity and promoting the 
dominance of T helper 1 (TH1) cells over TH2 cells while inhibiting 
the proliferation of T regulatory (Treg) cells. Low levels of leptin may 
account for the decreased cellular immunity associated with periods 
of nutrient deprivation. Leptin also affects innate immune cells, 
with effects ranging from the promotion of neutrophil activation and 
migration to the activation of monocytes and macrophages. Elegant experiments 
using mice deficient in the leptin receptor in different cellular 
compartments showed a requirement for leptin signalling in intestinal 
epithelial cells to prevent severe disease after exposure to Entamoeba 
histolytica. Comparisons of db/db mice, which lack a functional leptin 
receptor, and their wild-type littermates demonstrated that leptin 
controls infectivity and prevents severe inflammatory destruction of 
the intestine, thereby affecting mortality. These studies were extended 
to mice with engineered mutations in the leptin receptor that are found 
in human populations (Tyr1138Ser and Tyr985Leu, both of which 
disrupt signalling). These mutations rendered mice more susceptible 
to E. histolytica infection. Leptin levels are significantly reduced in 
the sera of germ-free mice. Moreover, obese, leptin-deficient (ob/ob) 
mice have marked differences in the taxonomic and genetic content 
of their gut microbial communities. To our knowledge, the effects of 
leptin-receptor deficiency on the gut microbiota have not been reported. 
Nonetheless, leptin-receptor deficiency and E. histolytica pathogenesis 
provide a setting in which the intersections between the endocrine and 
immune systems, enteric infection and gut microbial ecology can be 
explored. 

The ability to use macronutrients is essential for the generation and 
maintenance of a protective effector immune response. After stimulation 
through the T-cell receptor (TCR) and co-stimulation through CD28, 
the metabolic needs of T cells are met by a marked increase in the uptake 
and use of glucose and amino acids. A deficiency in glucose uptake 
negatively affects numerous facets of T-cell function, with impairment 
of both proliferation and cytokine expression. Similarly, deficiencies 
in amino acids such as tryptophan, arginine, glutamine and cysteine 
reduce immune-cell activation. Furthermore, TCR stimulation in the 
absence of co-stimulation, which leads to T-cell anergy, has been linked 
to a failure to upregulate metabolic machinery associated with 
amino-acid and iron uptake. 

Short-chain fatty acids (SCFAs) provide one of the clearest examples 
of how nutrient processing by the microbiota and host diet combine 
to shape immune responses. SCFAs are end products of the microbial 
fermentation of macronutrients, most notably plant polysaccharides that 
cannot be digested by humans alone because our genomes do not encode 
the large repertoire of glycoside hydrolases and polysaccharide lyases 
needed to cleave the varied glycosidic linkages present in these glycans. 
These missing enzymes are provided by the microbiome. The luminal 
concentration of intestinal SCFAs can be modified by the amount of 
fibre in the diet, which affects the composition of the microbiota. In 
addition to acting as an energy source for the host, SCFAs exert notable 
effects on host immune responses. Low levels of butyrate modify the 
cytokine production profile of TH cells and promote intestinal epithelial 
barrier integrity, which in turn can help to limit the exposure of the 
mucosal immune system to luminal microbes and prevent aberrant 
inflammatory responses. Production of another SCFA, acetate, by the 
microbiota promotes the resolution of intestinal inflammation through the 
G-protein-coupled receptor GPR43 (ref. 27). A recent study highlighted 
the important role of acetate production in preventing infection with 
the enteropathogen Escherichia coli (O157:H7). This effect was linked 
to the ability of acetate to maintain gut epithelial barrier function. 
Intriguingly, SCFAs may regulate the acetylation of lysine residues, 
a covalent modification that affects proteins involved in a variety of 
signalling and metabolic processes. The role of this covalent modification 
in modulating the activity of proteins intimately involved in innate and 
adaptive immune responses needs to be explored further. It is tempting 
to speculate that the covalent or non-covalent linkage of products of 
microbial metabolism to host proteins produced within the intestine, or 
at extra-intestinal sites, will be discovered and shown to have important 
regulatory effects. These different protein modifications could represent 
a series of mechanisms by which the microbial community metabotype 
is ‘imprinted’ on the host. 

If nutrients and derived metabolites reflect the functional activity of 
the microbiota, sensors of nutrient and metabolite availability can be 
considered akin to microbe-associated molecular patterns (MAMPs) 
that convey information about microbes to the host. Several families 
of innate receptors are involved in the recognition of MAMPs: these 
include Toll-like receptors (TLRs), inflammasomes, C-type lectins such 
as dectin-1, and RNA-sensing RIG-like helicases such as RIG-I and 
MDA5. The accompanying review by Maloy and Powrie (page 298) 
provides an overview of this area. We would like to emphasize that 
classical innate immune recognition pathways have evolved to survey 
the nutrient environment. TLR4 can sense the presence of free fatty 
acids, whereas ATP is an important activator of the inflammasome. 
Several other immune-cell-associated sensors couple information 
about the local nutrient or metabolite environment to the coordination 
of local immune responses. Examples are the serine/threonine kinase 
mammalian target of rapamycin (mTOR), double-stranded 
RNA-activated protein kinase (PKR), the aryl hydrocarbon receptor 
(AHR), and various nuclear hormone receptors such as the 
liver-X-receptor and the peroxisome-proliferator-activated receptors (PPAR-α, 
PPAR-β and PPAR-γ) (Table 1 and Fig. 2). The mTOR pathway is an 
example of how energy availability affects immune responses. mTOR 
is activated by phosphatidylinositol-3-OH kinase and the 
serine/threonine kinase AKT, and is inhibited by AMP-activated protein 
kinase, which is a sensor of cellular energy resources. Genetic and 
pharmacological approaches (the latter using rapamycin) indicate 
that mTOR signalling affects both the innate and adaptive arms of 
the immune system, including maturation and effector activity of 
dendritic cells, inhibition of Treg-cell development, promotion of the 
differentiation of TH1, TH2 and TH17 cells, regulation of CD8+ T-cell 
trafficking and inhibition of memory T-cell formation. PKR couples 
the presence of free fatty acids to immune activation, and has been 
implicated in the pathogenesis of obesity in mice fed a high-fat diet, 
including their development of immuno-inflammatory and 
insulin-resistant phenotypes (see below). AHR is activated by several agonists, 
including kynurenine, a product of tryptophan metabolism by 
indoleamine-2,3-dioxygenase. AHR modulates the differentiation 
of dendritic cells and promotes TH17-cell and Treg-cell differentiation 
and effector activity. Withdrawal of tryptophan and arginine 
controls immune responses. The presence of an intact amino-acid 
starvation response in T cells is essential for the immunosuppressive 
activity of tryptophan depletion by indoleamine-2,3-dioxygenase. 
This example illustrates how the ability of T cells to sense levels of a 
nutrient (tryptophan) in their local environment, rather than using the 
nutrient solely as a fuel source, is an important determinant of cell fate. 
If the assessment of local nutrient levels or metabolites is an important 
feature in the immune decision-making process, and if the products 
of microbial metabolism are previously unappreciated agonists or 
antagonists of immune-cell receptors, then an important challenge is 
to devise in vitro and in vivo models, including genetically manipulable 
gnotobiotic animals (such as mice or zebrafish), to identify the range 
of metabolites produced by a microbiota (and host) as a function of 
different defined diets. 


The case for micronutrients 

The intestinal microbiota can synthesize several vitamins involved in 
myriad aspects of microbial and host metabolism, including cobalamin 
(vitamin B12), pyridoxal phosphate (the active form of vitamin B6), 
which is involved in several enzymatic interconversions in amino-acid 
metabolism, pantothenic acid (vitamin B5), niacin (vitamin B3), biotin, 
tetrahydrofolate and vitamin K. In addition to vitamin B12, gut microbes 
produce a range of related molecules (corrinoids) with altered ‘lower 
ligands’, including analogues such as methyladenine and p-cresol. More 
than 80% of non-absorbed dietary vitamin B12 is converted to these 
alternative corrinoids. There is preliminary evidence to suggest that 
syntrophic relationships among members of the human microbiota, and 
the fitness of some taxa, may be based on the ability to generate, use or 
further transform various corrinoids. 

Folate and cobalamin produced by the gut microbiota could affect 
host DNA methylation patterns, whereas acetate produced by the 
microbial fermentation of polysaccharides could modify chromatin 
structure and gene transcription by histone acetylation. Thus, 
the inheritance of a mammalian genotype and intergenerational 
transmission of a microbiome — together with a complex dynamic in 
which the microbiome is viewed both as an epigenome and a modifier 
of the host epigenome during the postnatal period when host, host diet 
and microbial community co-evolve — could together shape human 
physiological phenotypes that are manifested during childhood or later 
in life. 

Numerous observational studies indicate that deficiencies in vitamins 
A, D and E and zinc can adversely affect immune function, particularly 
T-cell responses. Although a considerable body of work exists detailing 
the myriad effects of vitamins A, D and E on host immune responses, so 
far there is little evidence for a role of the microbiota in the biosynthesis 
or metabolism of these vitamins. However, stimulation of dendritic cells 
through TLR2 increases the expression of host genes associated with 
generation of the immunoactive form of vitamin A (retinoic acid), 
and enteric infection has been linked to vitamin A deficiency. 
Intriguingly, a recent study demonstrated that vitamin A deficiency 
leads to a complete loss of TH17 cells in the small intestine of 
specific-pathogen-free mice and an associated significant reduction in the 
abundance of segmented filamentous bacteria (SFB), a member 
of the Clostridiaceae family that drives intestinal TH17 responses 
in mice. Thus, vitamin A has the potential to modulate immune 
responses directly, by interacting with immune cells, or indirectly, by 
modulating the composition of the microbiota. 

The microbiota also affects the absorption of key minerals. Perhaps 
the best characterized micronutrient in terms of its interaction with 
both the microbiota and the immune system is iron. Iron-deficient 
mice are resistant to the development of experimental autoimmune 
encephalomyelitis, and have reduced delayed type hypersensitivity 
(also known as type IV hypersensitivity) responses and lower levels of 
IgM and IgG. Iron deficiency also impairs innate immune responses, 
because iron is required for the respiratory burst. Likewise, iron is an essential 
micronutrient for bacteria. Given the low solubility of Fe3+, microbes 
have evolved the capacity to produce several high-affinity iron-binding 
siderophores. Microbes take up soluble Fe3+-siderophore complexes 
by several active transporters. Early studies in gnotobiotic animals 
showed a link between the gut microbiota and the development of 
iron deficiency. Germ-free but not conventionally raised rats become 
anaemic when fed a low-iron diet. The germ-free rats also show 
increased loss of iron in their faeces compared with their conventionally 
raised counterparts. The iron balance that exists between host and 
microbiota is disturbed in a mouse model of Crohn's disease in which 
tumour-necrosis factor-α (TNF-α) expression is dysregulated: oral 
(but not parenteral) iron supplementation in these animals causes a 
shift in the gut microbial community composition, as defined by 16S 
ribosomal-RNA-based surveys, and exacerbates their ileitis. 

Metagenomic methods need to be applied to delineate further the 
role of the microbiota in micronutrient deficiencies. Several questions 
remain, such as how iron deficiency affects the configuration of the gut 
microbiota and microbiome, including the production of siderophores. 
Iron repletion could return the microbiota/microbiome to a normal, 
pre-deficient state; alternatively, there may be persistent structural 
and functional perturbations that require continued nutritional 
supplementation to correct. There may be particular configurations of 
the microbiota/microbiome that predispose the host to iron or other 
types of micronutrient deficiency. The iron content of mother’s milk 
during postnatal life could also affect the assembly and metabolic 
operations of the microbiota. In principle, these questions can first 
be addressed in gnotobiotic mouse models, and also extended to 
macronutrient-deficient states. 


The microbiota and the immune system in obesity 
Obesity, metabolic syndrome and diabetes illustrate the role that the 
diet–microbiota–immune axis has in shaping human systems biology. 
Although the marked increase in obesity worldwide can be linked to an 
ever-growing trend towards excessive caloric intake, the microbiota has 
also been implicated in obesity. Studies of a cohort of twins living in the 
United States indicate that the bacterial phylogenetic composition of the 
faecal microbiota and the representation of microbial genes involved 
in several aspects of nutrient metabolism in the faecal microbiome are 
different in lean versus obese twin pairs. Different research groups 
using different primers to amplify bacterial 16S ribosomal RNA genes 
for culture-independent analyses of gut microbial ecology, and studying 
different human populations consuming different diets, have reported 
varying results concerning the bacterial phylogenetic composition of 
the microbiota in lean versus obese individuals. 

Evidence that a link exists between the microbiota and obesity comes 
from transplant experiments in gnotobiotic mice. Gut communities 
from leptin-deficient, ob/ob, mice or mice with diet-induced obesity 
induce a greater increase in adiposity when transferred to germ-free 
recipients than do communities from wild-type littermates or mice 
that have been given a healthy, calorically less-dense diet. Germ-free 
mice are resistant to diet-induced obesity. Further studies have shown 
that the gut microbial community regulates the expression of genes 
that affect fatty-acid oxidation and fat deposition in adipocytes. For 
example, production of the secreted lipoprotein lipase (LPL) inhibitor 
angiopoietin-like protein 4 (ANGPTL4; also known as fasting-induced 
adipose factor) is suppressed by the microbiota: studies of germ-free 
and conventionalized wild-type and Angptl4−/− mice established that 
microbiota-mediated suppression of gut epithelial expression of this 
secreted LPL inhibitor results in increased LPL activity and fat storage 
in white adipose tissue. Moreover, Tlr5-deficient mice have a gut 
microbiota with a distinct configuration from that encountered in 
wild-type littermate controls. When their gut microbiota is transplanted to 
wild-type, germ-free recipients, food intake is increased compared with 
recipients of microbiota transplants from wild-type mice: increased 
adiposity and hyperglycaemia ensue. The mechanism underlying 
the increase in food consumption remains to be defined, although 
the authors of the study speculate that inflammatory signalling may 
desensitize insulin signalling in ways that lead to hyperphagia. 

Obesity in mice and humans is associated with the infiltration of 
adipose tissue by macrophages, CD8+ T cells and CD4+ T cells, and 
with the expression of inflammatory cytokines and chemokines such as 
interleukin-6 (IL-6), IL-17, TNF-α, CC-chemokine ligand 2 (CCL2) and 
interferon-γ. By contrast, adipose tissue in lean mice is home to a 
population of immunosuppressive Treg cells that prevents inflammation. 
Mice deficient in CC-chemokine receptor 2 (Ccr2) and with obesity 
induced by consumption of a high-fat diet have reduced macrophage 
infiltration of the adipose tissue and improved glucose tolerance relative 
to Ccr2-sufficient controls, highlighting the role of factors in recruiting 
inflammatory immune cells and their associated pro-inflammatory 
products in the pathogenesis of metabolic abnormalities associated with 
obesity. Blockade of TNF-α or expansion of Treg cells using anti-CD3 
monoclonal antibody prevents the onset of obesity-associated insulin 
resistance in a mouse model of diet-induced obesity. 

Table 1 | Metabolite sensors associated with immune cells 

Sensor          Agonist                               Immune response affected 

mTOR            S1P                                   Inhibits Treg-cell differentiation and maintenance 
                Leptin                                Promotes TH1-cell differentiation 
                Leptin                                Inhibits Treg-cell proliferation and function 

AHR             6-formylindolo[3,2-b]carbazole        TH17-cell differentiation and IL-22 production by TH17 cells 
                2,3,7,8-tetrachlorodibenzo-p-dioxin   Promotes Treg-cell induction 
                Kynurenine                            Promotes Treg-cell induction 

PKR             Free fatty acids; palmitic acid       Promotes insulin resistance through inhibitory 
                                                      phosphorylation of IRS-1 (ref. 33) 

RAR-RXR         Retinoic acid                         Promotes intestinal T-cell homing; promotes Treg-cell 
                                                      generation; promotes T-cell proliferation; promotes 
                                                      TH2-cell differentiation over TH1 cells 

VDR-RXR         1,25(OH)2 vitamin D3                  Inhibits lymphocyte proliferation; inhibits interferon-γ, 
                                                      IL-17 and IL-2 expression; promotes emergence of Treg 
                                                      cells; drives antimicrobial peptide expression; promotes 
                                                      T-cell expression of CCR10 (ref. 98) 

GPR120          ω-3 fatty acids                       Inhibits inflammatory cytokine production and chemotaxis 
                                                      by macrophages 

GPR43           Acetate                               Promotes resolution of intestinal inflammation 

P2X receptors   ATP                                   Promotes TH17-cell generation 

AHR, aryl hydrocarbon receptor; mTOR, mammalian target of rapamycin; PKR, double-stranded 
RNA-dependent protein kinase; RAR, retinoic acid receptor; RXR, retinoid X receptor; 
S1P, sphingosine-1-phosphate; VDR, vitamin D receptor. 

Inflammation drives the development of insulin resistance through 
the phosphorylation of insulin receptor substrate 1 (IRS-1) by TNF-α activation of 
c-Jun amino-terminal protein kinase 1 (JNK1), and perhaps inhibitor 
of nuclear factor-κB kinase-β (IKK-β), protein kinase C and mTOR. 
Whereas signalling by the adaptor protein MyD88 promotes the 
development of type 1 diabetes in pathogen-free NOD (non-obese 
diabetic) mice, germ-free Myd88−/− NOD animals are susceptible to this 
disorder. These findings suggest that particular intestinal microbial 
configurations can promote or prevent inflammatory immune 
responses that drive metabolic dysfunction. 

Mice fed a high-fat diet have increased serum levels of 
lipopolysaccharide. Furthermore, genetically obese mice that are deficient in 
leptin or its receptor have reduced intestinal barrier function. As noted 
earlier, SCFAs produced by microbial fermentation affect the barrier. 
Thus, it will be important to assess whether obese humans show similar 
reductions in barrier function. A high-fat diet alters the structure of the 
intestinal microbiota, potentially leading to a reduction in gut barrier 
integrity. The enhanced translocation of microbes and/or their antigens 
may result in increased microbial antigen load at extra-intestinal 
sites, enhanced immune stimulation and the development of insulin 
resistance. Furthermore, nutrients are known to activate inflammatory 
arms of the immune system directly. The capacity of the intestinal
microbiota to shape immune responses outside the intestine is well
documented. Studies have highlighted the ability of the microbiota and
specifically SFB to support the development of autoimmune arthritis
and experimental autoimmune encephalomyelitis, both of which have
been linked to excessive TH17 responses.

Unfortunately, the spatial relationships between members of the 
microbiota and their proximity to elements of the gut-associated 
immune system in healthy individuals or individuals with mucosal 
barrier dysfunction are not well understood. Gnotobiotic mouse 
models of obesity may help to provide important insights about the 
biogeography of microbial communities along the length and width of 
the gut, including whether microbial consortia occupy ectopic sites that 
could affect the development and perpetuation of barrier dysfunction 
(such as in the crypts of Lieberkühn, where multipotential gut stem
cells reside — as described by Medema and Vermeulen (page 318)). 
Newer methods, such as combinatorial labelling and spectral imaging 
fluorescence in situ hybridization (CLASI-FISH), offer a great deal
of promise for characterizing the spatial features of microbe-microbe 
and microbe-host cell interactions in the gut mucosa, especially if they 
are applied to gnotobiotic models composed of defined collections of 
sequenced microbes. 


Undernutrition and environmental enteropathy 
Undernutrition can have many clinical manifestations ranging 
from mild, asymptomatic micronutrient deficiencies to severe, life- 
threatening conditions such as kwashiorkor and marasmus. Estimates 
indicate that the implementation of current ‘best practice’ interventions 
— including increasing the time of breastfeeding, supplementing diets 
with zinc and vitamins, hygiene measures such as improving hand 
washing, and optimizing the treatment of acute severe malnutrition — 
could reduce mortality during the first three years of life by only 25%, 
even if there is almost perfect compliance. Several environmental and
genetic factors have long been postulated to influence the development
of moderate to severe forms of malnutrition, but the underlying
mechanisms remain poorly defined. Food availability, although a 
major factor, is not the only contributor. For example, in Malawi, the 
concordance for severe malnutrition between twins within the same 
household who are fed similar diets is only 50% (M. Manary, personal 
communication). This observation raises several questions. Do 
different configurations of the gut microbiota predispose one co-twin 
to kwashiorkor or marasmus? The effect of nutrient deficiency, in 
either the mother or her child, on the configuration of the microbiota 
and microbiome in the developing gut is not clear. It is possible that 
nutrient deficiency in the mother affects the assembly of the microbiota 
by causing changes in the mother’s gut microbiota or in the nutrient 
and immune content of her breast milk. Both the microbiota and milk 
are transmitted to the infant, yet we have much to learn about how the 
biochemical and immunological features of breast milk change, and how 
breast milk and the infant microbiota ‘co-evolve’ during the suckling 
period when a mother is healthy or malnourished (see below). 
Identifying how malnutrition affects the gut’s microbiome may prove 
to be very important for improving many associated clinical disorders. 
Malnutrition could delay the maturation of the gut’s microbial metabolic 
organ or skew it towards a different and persistent configuration that 
lacks the necessary functions for health or increases the risk of diseases, 
including immuno-inflammatory disorders. Nutrient repletion may 
return the microbiota/microbiome to a ‘normal’ pre-deficient state;
alternatively, structural and functional perturbations may persist, which
require continued nutritional supplementation to correct. There may be
microbiome configurations that correlate with vaccine responsiveness.
Studies of severe forms of malnutrition indicate that these patients
often have many characteristics of environmental enteropathy.
Also known as tropical sprue or tropical enteropathy, environmental 
enteropathy is a poorly characterized chronic inflammatory disease 
that mainly affects the small intestine. The disorder afflicts individuals 
who reside in areas with poor sanitation and who have high exposure 
to faecal-contaminated water and food. As an example, Peace Corps 
volunteers returning to the United States from such areas would report 
a history of diarrhoeal disease and have signs and symptoms of chronic 

Figure 2 | Metabolite sensors that help to coordinate immune
responses. Immune-cell-associated sensors use information about the
local nutrient or metabolite milieu to organize local immune responses. (1)
The dietary intake of macronutrients and micronutrients shapes microbial
community structure (2), which, in turn, changes the nutritional value of the
consumed food. (3) Unmodified dietary components are absorbed in the
intestine, where they can interact with immune cells. (4) Microbial signals in
the form of microbe-associated molecular patterns (MAMPs) modify local
mucosal immune responses through innate signalling pathways such as the
inflammasome or TLRs. Inflammasomes recruit the adaptor protein
apoptosis-associated speck-like protein containing a CARD (ASC), which promotes
binding of caspase, which in turn cleaves pro-IL-1β to IL-1β. (5)
Microbe-modified dietary components (such as acetate produced by the fermentation
of polysaccharides) provide signals by which the immune system can monitor
the metabolic activities of the microbiota. (6) Vitamin A can modify the
representation of segmented filamentous bacteria (SFB) in the mouse gut
microbiota, and is an example of a micronutrient directly modifying intestinal
microbial ecology. SFB induce the differentiation of TH17 cells. DC, dendritic
cell; LXR, liver-X-receptor; RAR, retinoic acid receptor; RXR, retinoid X
receptor; VDR, vitamin D receptor.


malabsorption and nutritional deficiencies. The malabsorption
associated with environmental enteropathy is often subtle in
children, manifesting itself clinically only as stunting due to chronic
undernutrition. The breakdown in intestinal mucosal barrier function
in this disorder can lead to increased susceptibility to enteropathogen
infections. Recurrent infections predispose individuals to nutritional
deficiencies and further compromise barrier function, leading to
a vicious cycle of further susceptibility to infection and worsening
nutritional status.

Efforts to break this cycle have focused on vaccines that could prevent 
infection. However, there is significant heterogeneity in the responses to 
vaccination between children living in highly Westernized societies and 
children living in certain developing countries. Oral rotavirus vaccine 
elicits responses in more than 95% of children living in Westernized 
societies but in only 49% of children in Malawi. Lower oral polio vaccine efficacy
has been reported in populations with greater enteric disease burden.
Studies in Chilean children have demonstrated a negative correlation
between oral cholera vaccine responses and small bowel bacterial
overgrowth. In addition, patients with coeliac disease, which shares
phenotypic features with environmental enteropathy, can have a blunted 
response to parenteral hepatitis B vaccination, but only when their 
disease is active. 

Traditionally, the most definitive test for environmental enteropathy 
has been small intestinal biopsy. Biopsies typically show reductions 
in small intestinal villus height, increased numbers of intraepithelial 
lymphocytes, and increased infiltration of the underlying lamina 
propria by T cells with a predominant TH1 phenotype. Some of these
features are found in patients with coeliac disease, in which a luminal
antigen (gliadin) drives a T-cell response that, in turn, results in epithelial
destruction, reduced absorptive surface area and malabsorption.
Unlike in coeliac disease, the antigens that drive the host immune response
in environmental enteropathy are unknown, but there may be an
association with certain HLA alleles, such as HLA-Aw31 (ref. 82).

The pathological events that lead to the development of environmental 
enteropathy are poorly understood, in part because of the absence of a 
robust set of readily assayed biomarkers that would improve the ability 
to diagnose, classify and potentially subcategorize individuals that show 
the broadly defined clinical manifestations that define this disorder. 
Epidemiological data showing a strong association of environmental
enteropathy with areas of poor sanitation, as well as the occasional
epidemic spread of the disease and its responsiveness to antibiotic
treatment, reinforce the long-standing belief that there is an ‘infectious’
aetiology. Although cultures of jejunal aspirates from individuals
with environmental enteropathy have suggested contamination of
the proximal small bowel by aerotolerant Gram-negative bacteria,
no single pathogen or set of pathogens has been identified in the gut 
microbiota of most affected individuals. There is a distinct possibility 
that this enteropathy is not the result of a single pathogen but rather the 
result of colonization with microbial consortia that are inflammogenic 
in the context of a susceptible host. In fact, what constitutes a normal 
immune repertoire in a healthy gut probably varies considerably 
depending on environmental exposures and the configuration of a 
microbiota. Moreover, most metagenomic studies of the microbiota have 
focused on members of the superkingdom Bacteria, which dominate 
these communities. Other tools need to be developed so that they can 
be extended to viral, archaeal and eukaryotic components. The latter 
group includes parasites that compete for nutrients within the intestines 
of infected individuals. Parasites can interact directly with bacterial 
members of the microbiota during their life cycle in ways that promote 
hatching of parasite eggs, and can shape immune function through 
factors such as excretory-secretory products, which have been shown 
to modulate cytokine production, basophil degranulation and
immune-cell recruitment and to interfere with TLR signalling.

It seems reasonable to posit that individuals living in regions with 
high oral exposures to faecal-contaminated water and foods, and/or 
with a eukaryotic component of their gut community that includes 
parasites, will have gut-associated immune systems with significantly 
different structural and functional configurations than those without 
these exposures. In this sense, including the term environmental 
together with enteropathy is logical and emphasizes the need to place a 
host’s immune and gut microbiome phenotypes in the context of their 
various exposures. 

The representation and expression of microbiome genes in the
gut communities of affected individuals compared with healthy 
controls may correlate with environmental enteropathy. Comparative 
metagenomic studies could thus provide important new diagnostic 
tools in the form of microbial taxa and microbiome gene functions. 
In addition, they could provide pathophysiological insight about 
relationships between host diet, enteropathogen representation in 
the microbiota, and microbiome gene composition and expression 
(including expressed metabolic functions). A major challenge will be 
to correlate these data with the results of quantitative phenotyping of
the innate and adaptive immune systems of the human gut. This will
require new and safe approaches for the sampling of immune system
components, especially in the gut mucosa. Similarly, as noted earlier,
the spatial relationships between members of the microbiota, as well
as their proximity to elements of the gut-associated immune system in
healthy individuals or in individuals with mucosal barrier dysfunction,
are not well understood.


Microbiota assembly and breast milk 

Breast milk is known to protect newborns from infection, in part 
because of the copious quantity of maternally generated antibodies that 
it contains. Although these antibodies have specificity for components 
of the microbiota, the microbial targets are not well defined for 
given maternal-infant dyads, or as a function of time after delivery. 
In addition to antibodies, breast milk contains other immunoactive 
compounds, including cytokines such as IL-10, growth factors such as 
epidermal growth factor and antimicrobial enzymes such as lysozyme. 
The effect of maternal nutritional status on the glycan, protein, lipid 
and cytokine landscape of breast milk needs to be defined further. 
This analysis should have a temporal axis that explores co-evolution 
of the immunological and nutrient properties of mother’s milk and 
the postnatal assembly and maturation of the infant gut microbiota 
and of the innate and adaptive immune systems. Important feedback 
systems may be revealed. Similarly, knowledge of the vaginal and 
cutaneous microbiota of mothers before and after birth, as a function 
of their nutritional status, could be very informative. For example, 
common configurations of microbial communities that occupy these 
body habitats could correlate with the development of environmental 
enteropathy in mothers and their offspring. 


Personalized gnotobiotics and culture collections 

As noted above, studies have demonstrated the ability of intestinal 
microbial communities to reshape themselves rapidly in response to 
changes in diet. These observations raise the question of whether and 
how malnourished states affect (1) the spatial/functional organization of 
the microbiota and the niches (professions) of its component members; 
(2) the capacity of the community to respond to changes in diet; (3) 
the ability of components of the microbiota to forage adaptively on 
host-derived mucosal substrates; and (4) the physical and functional 
interactions that occur between the changing microbial communities 
and the intestinal epithelial barrier (including its overlying mucus 
layer). One way of developing the experimental and computational 
tools and concepts needed to examine these challenging questions 
in humans is to turn to gnotobiotic mice that have been ‘humanized’ 
by the transplantation of gut communities from human donors with 
distinct physiological phenotypes, and to feed these mice diets that are 
representative of those of the microbiota donor. 


Personalized gnotobiotic mouse models 

We have used metagenomic methods to show that gut (faecal) 
communities can be efficiently transplanted into germ-free mice, 
and the mice can then be fed diets that resemble those consumed 
by the human microbiota donors, or diets with ingredients that are 
deliberately manipulated. Transplanted human gut microbial
communities can be transmitted from gnotobiotic mothers to their
pups. In principle, mice humanized with microbiota from individuals
residing in different regions of the world, and given diets that are
representative of those cultural traditions, can provide proof-of- 
principle global ‘clinical trials’ of the nutritional value of foods and 
their effect on the microbiota and the immune system. 

Transplantation of a human faecal microbiota into germ-free mice 
can be viewed as capturing an individual’s microbial community at a 
moment in time and replicating it in several recipient gut ecosystems. 
The humanized mice can be followed over time under highly 
controlled conditions in which potentially confounding variables 
can be constrained in ways that are not achievable in human studies. 
This type of personalized gnotobiotics also provides an opportunity to 
determine the degree to which human phenotypes can be transmitted 
via the gut microbiota as a function of diet. Moreover, the documented 
responses of microbial lineages and genes encoding metabolic pathways 
in the transplanted, replicated communities may provide mechanistic 
insight into differences in the adaptations of healthy versus diseased gut 
microbiomes (and the host immune system) to changes in diets, plus 
new biomarkers of nutritional status and the effect of various therapeutic 
interventions, including those based on dietary manipulations. Putative 
microbial biomarkers obtained from studies of these mice can, in turn, 
be used to query data sets generated directly from the human donor(s). 

Despite the potential power of using humanized mice to study 
interactions between the host immune and metabolic systems and the 
intestinal microbiota under highly controlled conditions, this approach 
has caveats. Recent work on TH17 responses suggests that unlike the
mouse microbiota, which contains SFB, a faecal microbiota from
a human donor is not sufficient to drive the expression of
immune-related genes in the small intestine of previously germ-free mice. This
raises the possibility that humanization may not fully recapitulate the
capacity of a mouse microbiota to mature the intestinal immune system
in mice. However, earlier studies on the effects of human microbiota on
the mouse immune system showed that the ability of E. coli heat-labile
enterotoxin to break oral tolerance to ovalbumin in germ-free mice can
be inhibited by transplantation of either a human or a mouse microbiota
during the neonatal period. Furthermore, a single component of a
human gut symbiont, the polysaccharide A component of Bacteroides
fragilis, can mature components of the CD4+ T-cell response in mice.
Finally, we have observed a similar increase in the frequency of TCR-β+
cells among lymphocytes in the mesenteric lymph nodes of gnotobiotic 
recipients of a human or mouse microbiota, compared with germ-free 
controls (P.P.A., V. K. Ridaura and J.I.G., unpublished observations). 
This suggests that although not all components of the mouse immune 
system will be matured by a human gut microbiota, the immune system 
is not likely to remain ignorant of these communities. In addition, any 
differences detected in direct comparisons of the effects of two different 
human gut communities may represent responses relevant to the human 
immune system. 


Personalized bacterial culture collections 

We have recently shown that the human faecal microbiota consists 
largely of bacteria that can readily be cultured. Metagenomic
analysis indicates that most of the predicted functions of a human’s 
microbiome are represented in its cultured members. In gnotobiotic 
mice, both complete and cultured communities have similar properties 
and responses to dietary manipulations. By changing the diet of the 
host, the community of cultured microbes can be shaped so that it 
becomes enriched for taxa suited to that diet. These culture collections 
of anaerobes can be clonally arrayed in multiwell formats: this means 
that personalized, taxonomically defined culture collections can 
be created from donors representing different human populations 
and physiological phenotypes in which the cultured microbes have 
co-evolved within a single human donor's gut habitat. 

Together, these advances yield a translational medicine pipeline 
for examining the interactions between food and food ingredients, 
the microbiota, the immune system and health. The goals for such 
a human translational medicine pipeline are to identify individuals 
with interesting phenotypes, to assess the transmissibility of their
phenotypes by human microbiota transplants into gnotobiotic 
animals, to select candidate disease-modifying taxa (retrieved 
from clonally arrayed, taxonomically defined personal bacterial 
culture collections), to sequence selected taxa and to reunite them 
in various combinations in gnotobiotic mice as defined model gut 
communities. The interactions of disease-modifying taxa with 
one another, their effects on host biology, and how these effects 
are influenced by diet can be further explored using methods such 
as high-throughput complementary DNA sequencing (RNA-Seq), 
mass-spectrometry-based proteomics and metabolomics, multilabel 
fluorescence in situ hybridization (for biogeographical studies of the 
microbiota), whole-genome transposon mutagenesis (to identify 
fitness factors for microbes under various dietary contexts), and
immune profiling and other measurements of mucosal barrier 
function. Knowing the degree to which tractable bacterial taxa can 
influence host physiology, and how dietary components can be used 
to affect specific organisms in the microbiota in ways that provide
benefit to the host may be useful for discovering new generations of 
probiotics and prebiotics. 


Looking ahead 

With massive prospective national surveys planned and being 
implemented — such as the National Institutes of Health National 
Children’s Study, which will follow a representative sample of 100,000 
children from before birth to age 21 — the time is right for an initiative 
to evaluate the relationships between our diets, nutritional status, 
microbiomes and immune systems. Many components could constitute 
this initiative. We can readily foresee several of these. 


Dietary databases 

As noted earlier, there is a need to create more and improved databases 
for monitoring changing patterns of food consumption, in which the 
surveillance efforts of several organizations can be integrated. This 
tool and other interdisciplinary approaches could be used to define 
a set of study populations that represents established and emerging 
food consumption patterns in distinct cultural and socio-economic 
settings. An emphasis could be placed on comparing humans living 
in Westernized societies with those living in developing countries 
that are undergoing marked transitions in lifestyles and cultural 
traditions. New, reliable, cost-effective and generalized methods will 
be needed to acquire quantitative data about the diets consumed by 
individuals in these study populations, and the resultant data will 
need to be deposited in searchable databases using defined annotation 
standards. Moreover, guidelines related to the ethical and legal aspects 
of human-subject research involving observational and interventional 
nutritional studies of pregnant women and their offspring need to be 
further developed. 


New biomarkers of nutritional status 

Readily procured human biospecimens could be used together with 
high-throughput, targeted and non-targeted (quantitative) profiling 
of metabolites in comprehensive time-series studies to define 
the relationship between diet, nutritional status and microbiome 
configuration in healthy individuals at various stages of life (for 
example, in women before, during and after pregnancy, and in 
their children during the first five years after birth). This could be 
accompanied by studies of malnourished individuals before, during 
and after well-justified, defined nutritional interventions. In addition to 
these data, genomes (genotypes), epigenomes and microbiomes could 
be characterized in these study cohorts together with a variety of clinical 
parameters (such as vaccine responses) and environmental parameters 
(such as water sanitation). The resultant data sets would be deposited in 
annotated searchable databases. A translational medicine pipeline that 
includes relevant cellular and animal models would help to guide the 
design and interpretation of these human studies. 


Quantitative phenotyping of the immune system

As noted earlier, a major challenge is to obtain cellular and molecular 
biomarkers for quantitative profiling of the innate and adaptive immune 
systems, including biomarkers of mucosa-associated barrier function. 
Given the small quantities of biomaterials available from certain body 
sites, this initiative should help to advance ‘miniaturizing technology’ 
for quantitative measurements of cells and biofluids. Non-invasive 
imaging-based biomarkers are also needed. 

Goals include identifying new host and microbial biomarkers and 
mediators of nutritional status, determining the nutritional value of 
various foods, and characterizing the function of the human adaptive 
and innate immune systems (including mucosal barrier integrity and 
mucosal immunity) and the dynamic operations of the microbiota. 
This information would be used for demonstration projects that 
rigorously define nutritional health and test preventive or therapeutic 
recommendations for micronutrient and macronutrient consumption, 
for example in pregnant women and infants/children, and their effect 
on the assembly and operations of the immune system. The microbiome 
component could also help to define a previously uncharacterized axis 
of human genetic evolution (our microbiome evolution), reflecting 
in part our changing dietary habits. It could also produce testable 
hypotheses about unappreciated aspects of the pathophysiology of 
Western diseases, and yield new microbiome-based strategies for 
disease prevention or treatment.


Gene targeting in embryonic stem cells has become the principal technology for manipulation of the mouse genome,
offering unrivalled accuracy in allele design and access to conditional mutagenesis. To bring these advantages to the wider 
research community, large-scale mouse knockout programmes are producing a permanent resource of targeted 
mutations in all protein-coding genes. Here we report the establishment of a high-throughput gene-targeting pipeline 
for the generation of reporter-tagged, conditional alleles. Computational allele design, 96-well modular vector 
construction and high-efficiency gene-targeting strategies have been combined to mutate genes on an unprecedented 
scale. So far, more than 12,000 vectors and 9,000 conditional targeted alleles have been produced in highly 
germline-competent C57BL/6N embryonic stem cells. High-throughput genome engineering highlighted by this study 
is broadly applicable to rat and human stem cells and provides a foundation for future genome-wide efforts aimed at 
deciphering the function of all genes encoded by the mammalian genome. 


Following the complete sequencing of the human and mouse genomes,
the functional analysis of each of the twenty thousand or so
protein-coding genes remains an important goal and a major technical
challenge. Several genome-wide mutagenesis strategies have been applied
in the mouse, including ethyl-nitrosourea (ENU) mutagenesis,
transposon mutagenesis, gene trapping and gene targeting. Gene trapping in
mouse embryonic stem (ES) cells has been the most productive so far,
providing hundreds of thousands of random insertional mutations in
more than half of the protein-coding genes in the mouse. Notably,
these ES cell resources can be archived indefinitely and are easily
distributed to the scientific community for the purpose of generating
knockout mice. However, gene-trap alleles cannot be precisely
engineered and the strategy favours genes expressed in mouse ES cells.

Given the limitations of gene trapping, it is clear that the generation 
of a complete set of gene knockouts in the mouse will require the 
application of gene-targeting technology in ES cells. Gene targeting
can be used to engineer virtually any alteration in the mammalian
genome by homologous recombination in mouse ES cells, from point
mutations to large chromosomal rearrangements. Over the past
20 years, gene targeting has been used to elucidate the function of 
more than 5,000 mammalian genes. Scaling this technology to the 
remainder of the genome presents numerous technical challenges 
and requires the production of targeted ES cells on an unprecedented 
scale, beyond the scope of conventional methodologies. 

The first targeting pipeline for ES cells was reported several years ago
before the completion of the mouse genome sequence (Velocigene).
Bacterial artificial chromosome (BAC)-based targeting vectors were
constructed to replace the coding sequence of the target gene with a
lacZ reporter and promoter-driven selection cassette. Oligonucleotides
required for the construction of targeting vectors by recombineering
were based on cDNA sequences surrounding the translation initiation
and termination signals of each target gene, thus requiring no previous
knowledge of the underlying genomic structure of the gene. In a single
recombineering step, modified BAC clones were generated with high
efficiency and used to target genes in ES cells. Correctly targeted events,
which involved the deletion of up to 70 kilobases (kb) of genomic
sequence, were identified using a novel high-throughput
allele-counting assay. The deletion of large regions of genomic sequence,
although effective for eliminating the function of the target gene, can
have unintended consequences on the regulation of adjacent and
distant transcriptional units.

To support and accelerate progress towards the genetic analysis of 
all mammalian genes, large-scale knockout consortia were established 
in 2006 with the goal of generating a complete resource of
reporter-tagged null mutations in C57BL/6 mouse ES cells. C57BL/6 is one of
the best characterized inbred strains, is the reference strain for the
mouse genome sequence and breeds well in the laboratory. Thus, the
study of mutant alleles in a pure C57BL/6 genetic background is
considered to be ideal for large-scale phenotyping efforts that will
follow. Highly germline-competent ES cell lines from the C57BL/6N
substrain of mice have been established for this project. A
common web portal providing information and access to the resource
has been established, with links to designated repositories for
ordering vectors, ES cell clones and mice.

Here we describe a pipeline for the design and mass parallel
construction of conditional targeting vectors by serial 96-well BAC
recombineering and high-throughput gene targeting in C57BL/6 ES
cells. Our pipeline is configured to create a number of useful resources
en route to the generation of targeted ES cells (Supplementary Fig. 1).
Ongoing large-scale production of targeted ES cell lines demonstrates
rates of homologous recombination in C57BL/6 ES cells well above
the historical average. Our pipeline forms the basis for the generation
of thousands of lacZ-tagged conditional alleles for the European
Conditional Mouse Mutagenesis (EUCOMM) and the National
Institutes of Health Knockout Mouse (KOMP) programs as part of
the international knockout effort.
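
To make the idea of serial 96-well processing concrete, the following minimal sketch assigns a list of vector designs to plate and well positions. It is an illustration only: the row-by-row fill order, the zero-padded well labels and the function names `well_for` and `assign_wells` are assumptions made here, and the plate layout actually used in the pipeline is not described in this text.

```python
import string

ROWS = string.ascii_uppercase[:8]    # rows A-H of a standard 96-well plate
COLUMNS = range(1, 13)               # columns 1-12
WELLS_PER_PLATE = len(ROWS) * len(COLUMNS)


def well_for(index):
    """Map a zero-based design index to (plate number, well label).

    Assumption: wells are filled row by row (A01..A12, then B01..B12, ...).
    """
    plate, offset = divmod(index, WELLS_PER_PLATE)
    row, column = divmod(offset, len(COLUMNS))
    return plate + 1, f"{ROWS[row]}{column + 1:02d}"


def assign_wells(design_ids):
    """Return a dict mapping each (hypothetical) design ID to its plate and well."""
    return {design_id: well_for(i) for i, design_id in enumerate(design_ids)}
```

For example, `assign_wells(["design_a", "design_b"])` places the two hypothetical designs in wells A01 and A02 of plate 1, and the 97th design in a longer list would start plate 2.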


Computer-assisted design of alleles


Conditional alleles permit the analysis of gene function in a
tissue-specific or temporal manner during embryonic and postnatal
development. Our conditional allele is based on the ‘knockout-first’ design,
a strategy that combines the advantages of both a reporter-tagged and a
conditional mutation (Fig. 1 and Supplementary Fig. 2). In contrast to
standard conditional designs, the initial unmodified allele is predicted
to generate a null allele through splicing to a lacZ trapping element
contained in the targeting cassette. Our trapping cassettes include the
mouse En2 splice acceptor and the SV40 polyadenylation sequences,
signals that have proven to be highly effective in creating null alleles in
mice.

The knockout-first allele can be easily modified in ES cells or in 
crosses to transgenic FLP and cre mice. Conditional alleles are generated 
by removal of the gene-trap cassette by Flp recombinase, which reverts 
the mutation to wild type, leaving loxP sites on either side of a critical
exon. Subsequent exposure to Cre deletes the critical exon to induce a 
frameshift mutation and trigger nonsense-mediated decay of the 
mutant transcript. Many cre transgenic strains are available for the 
study of gene function in specific tissues and developmental time points 
(see http://www.creline.org). 
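
The conversions described above can be sketched as a small lookup table. This is only an illustrative encoding: the allele names (tm1a, tm1b, tm1c, tm1d) are those used in Fig. 1, whereas the function names and the decision to raise an error for combinations the text does not describe are assumptions made for this example.

```python
# Illustrative encoding of the allele conversions described in the text and Fig. 1.
# The function names and error handling are hypothetical, not part of the pipeline.
TRANSITIONS = {
    ("tm1a", "Flp"): "tm1c",  # Flp removes the trapping cassette: conditional allele
    ("tm1a", "Cre"): "tm1b",  # Cre deletes selection cassette and floxed exon: lacZ-tagged allele
    ("tm1c", "Cre"): "tm1d",  # Cre deletes the floxed exon: frameshift mutation
}


def apply_recombinase(allele: str, recombinase: str) -> str:
    """Return the allele obtained by exposing `allele` to `recombinase`."""
    try:
        return TRANSITIONS[(allele, recombinase)]
    except KeyError:
        # Combinations not covered in the text are rejected rather than guessed.
        raise ValueError(f"conversion of {allele} by {recombinase} is not described")


def convert(allele: str, recombinases) -> str:
    """Apply a series of recombinases in order, starting from `allele`."""
    for recombinase in recombinases:
        allele = apply_recombinase(allele, recombinase)
    return allele
```

For instance, `convert("tm1a", ["Flp", "Cre"])` walks from the knockout-first allele through the conditional allele to the frameshift deletion.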

Typically, loxP sites are placed in introns of genes to avoid disrupt- 
ing normal transcription, processing and translation of the target 
gene. The loxP and FRT sites are positioned to minimize possible 
interference with the splice sites of the critical exon. In some cases, 
the presence of the recombinase sites may perturb normal splicing 
patterns”. This caveat notwithstanding, knockout-first alleles are very
useful for proving a causal link between gene disruptions and observed
phenotypes. Reversion of the phenotype with Flp or, conversely,
induction of the phenotype with Cre rules out potential effects of
secondary linked mutations that can arise in cultured ES cells”. 
Furthermore, removal of the FRT-flanked stop cassette is particularly 
useful for further studies of genes that present heterozygous lethal 
phenotypes. 

The vector design process ideally begins with high-quality manual 
annotation of gene structures*. Manual annotation identifies and 
resolves errors in automated gene predictions and captures all known 
transcript variants from available messenger RNA evidence. However,
manual annotation of genes is a time-consuming process and proved
rate-limiting in our high-throughput pipeline. Although the accuracy
of automated gene prediction is improving, vector designs for Ensembl
gene structures must be approached with caution.

Figure 1 | Schematic of the ‘knockout-first’ conditional allele. The
‘knockout-first’ allele (tm1a) contains an IRES:lacZ trapping cassette and a
floxed promoter-driven neo cassette inserted into the intron of a gene,
disrupting gene function. Flp converts the ‘knockout-first’ allele to a
conditional allele (tm1c), restoring gene activity. Cre deletes the promoter-
driven selection cassette and floxed exon of the tm1a allele to generate a lacZ-
tagged allele (tm1b) or deletes the floxed exon of the tm1c allele to generate a
frameshift mutation (tm1d), triggering nonsense-mediated decay of the deleted
transcript.

To assist in the design of conditional alleles, we developed a compu- 
tational tool to identify oligonucleotide sequences (50-mers) suitable 
for recombineering. These sequences are used to insert a selection 
cassette and loxP site around the critical exon and to recover homolog- 
ous sequence from the BAC required for gene targeting (Fig. 2a). More 
generally, these computational tools can be applied to any other mam- 
malian or non-mammalian genome for which the construction of large 
numbers of recombineered DNA constructs is desired. Each design is 
displayed on the genome browser (Fig. 2b) and manually inspected to 
choose the optimal design. Valid designs are selected for the 5’-most 
critical exon(s) that is common to all known transcript variants and 
disrupts at least 50% of the protein-coding sequence. Designs are 
rejected if the deleted region contains highly conserved intronic 
sequence as these elements are likely to correspond to regulatory ele- 
ments and complicate the interpretation of the mutant phenotype in 
mice’*??. 
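
The selection rules in this paragraph can be sketched in a few lines of Python. This is a simplified illustration, not the design software itself. It assumes that every exon is fully coding, that an exon induces a frameshift when its length is not divisible by 3, and that "disrupts at least 50% of the protein-coding sequence" means that the exon and everything downstream of it account for at least half of the coding bases. The `Exon` class and `pick_critical_exon` function are hypothetical names, and the conserved-intron filter is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exon:
    name: str
    length: int             # exon length in bases (assumed fully coding)
    transcripts: frozenset  # IDs of the transcript variants containing this exon


def pick_critical_exon(exons, all_transcripts, min_disrupted=0.5):
    """Return the 5'-most exon that is common to all transcripts, induces a
    frameshift when deleted and disrupts enough of the coding sequence.

    `exons` must be ordered 5' to 3'. Returns None if no exon qualifies.
    """
    required = frozenset(all_transcripts)
    total = sum(exon.length for exon in exons)
    upstream = 0  # coding bases located 5' of the current exon
    for exon in exons:
        common = exon.transcripts >= required
        frameshift = exon.length % 3 != 0
        disrupted = (total - upstream) / total if total else 0.0
        if common and frameshift and disrupted >= min_disrupted:
            return exon
        upstream += exon.length
    return None
```

A gene for which the function returns `None` corresponds to the cases discussed next, where no single exon satisfies the design criteria.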

Approximately 40% of protein-coding genes do not fit our design 
criteria, most commonly, small transcription units composed of one 
or two exons. Genes with alternative 5’ end transcripts are also prob- 
lematical. In some cases, it is not possible to remove a single exon or 
cluster of exons that disrupts all isoforms. These genes have been set 
aside for other partners within the international knockout consortium 
to generate standard lacZ-tagged deletion alleles using, for example, 
Velocigene technology". 


Construction of modular targeting vectors 


For the generation of conditional gene-targeting vectors, we developed a 
strategy for high-throughput, serial, liquid BAC recombineering in 96- 
well format (Fig. 3) similar to that reported for transgene production”. 
We adopted a modular strategy for the construction of targeting vectors 
using recombineering to create Gateway-adapted intermediate vectors 
(Fig. 4a) that are later assembled into the final targeting construct 
through in vitro Gateway reactions (Fig. 4b). For targeting in C57BL/6N
ES cells’®, we made use of indexed C57BL/6J BAC libraries’ for the
construction of targeting vectors. 

The construction of Gateway-adapted intermediate targeting vec- 
tors from BACs involves three consecutive recombineering steps: 
insertion of an attR1/attR2 zeo-pheS Gateway element upstream of 
the critical exon (Fig. 3b and Supplementary Fig. 3); insertion of a 
floxed kanR cassette downstream of a critical exon (Fig. 3c); and sub- 
cloning of the modified region of genomic DNA (8-10kb) into a 
Gateway-adapted plasmid backbone by gap repair (Fig. 3d and Sup- 
plementary Fig. 3). Heterologous attR3/attR4 sites are included to 
enable switching of the plasmid backbone to introduce a negative 
selection cassette for positive-negative targeting in ES cells. The 
exquisite efficiency and nucleotide precision of Red operon-induced 
recombination in bacteria permitted the assembly of DNA constructs 
in 96-well format through three rounds of recombineering with an 80% 
overall efficiency (Supplementary Table 1). This efficiency of vector 
production readily accommodates the needs of the global mouse gene- 
targeting projects that aim to knock out thousands of genes per year’. 


Assembly of the final targeting constructs 


Gateway technology has been successfully used for the construction of 
large-scale genomic resources”. The use of Gateway technology 
minimizes the potential for deleterious mutations common to poly- 
merase chain reaction (PCR)-based cloning methods. We developed a 
series of promoterless and promoter-driven selection cassettes flanked 
by attL1/attL2 sites (Supplementary Fig. 4). To use positive-negative 
selection for gene targeting”, a plasmid backbone was constructed that 
contains attL3/attL4 Gateway elements and a diphtheria-toxin-A- 
chain*' (DTA) expression cassette. Final targeting constructs were 
assembled in vitro in a three-part Gateway reaction (Fig. 4b) in 96-well
format and sequence-confirmed across all recombineered junctions.
Final targeting vectors were recovered from 95% of the intermediate
plasmids (Supplementary Table 1). Thus, the overall efficiency of
vector construction is 75% and, so far, we have constructed more than
12,000 final targeting vectors.

Figure 2 | Computational design of oligonucleotides for recombineering
and LR-PCR genotyping. a, A critical exon(s) common to all transcript
variants (red box) is identified. Recombineering oligonucleotides (50-mers) are
identified by ArrayOligoSelector** within pre-defined blocks (G5, U, D, G3) of
genomic sequence for insertions of the targeting cassette and 3’ loxP site and for
plasmid rescue of the 5’ and 3’ homology arms by gap repair. For LR-PCR
genotyping, multiple primers (25 to 30-mers) are then selected from 1-kb
blocks of genomic sequence (GF, GR) outside the homology arms. b, Display of
conditional alleles on the Ensembl genome browser (Distributed Annotation
System (DAS) source = KO alleles). A conditional design for the merged
Ensembl/Havana Rbmx gene on the reverse strand is shown.


The intermediate vectors themselves (Fig. 4a) represent an important 
modular resource that can be re-used to generate alternative vector 
designs or additional mutant alleles in the future. For example, targeting 
cassettes containing specialized reporters, such as alkaline phosphatase 
or green fluorescent protein, can be rapidly assembled to provide 
alternative visualization of gene expression. Furthermore, targeting 
vectors with different selectable markers can be readily constructed to
knock out the second allele of genes for functional studies in homozygous
ES cells. Finally, knock-ins of wild-type and mutant cDNAs
provide an avenue for detailed structure-function studies or to explore
human variation. Thus, a permanent library of intermediate targeting
plasmids will permit the further exploitation of targeting technology in
the future.

Figure 3 | Construction of Gateway-adapted intermediate targeting vectors
by 96-well BAC recombineering. Recombineering steps and elapsed time are
shown. a, BAC clones are arrayed in 96-well format and electroporated with a
plasmid expressing arabinose-inducible Red proteins (pBADgbaA)”.
b-d, After arabinose induction, cells are electroporated with PCR fragments
containing R1-pheS/zeo-R2 Gateway element (b), loxP-kan-loxP cassette
(c) and R3-ori/ampR-R4 subcloning plasmid (d). e, After gap repair, plasmid
DNA is prepared and transformed into Cre-expressing bacteria to remove the
kanR cassette, leaving a single loxP site downstream of the critical exon.
Antibiotics used at each step are: A, ampicillin; C, chloramphenicol; K,
kanamycin; T, tetracycline; Z, zeocin.

Figure 4 | Intermediate and final targeting constructs. a, Schematic showing
the structure of the Gateway-adapted intermediate plasmid. A rare AsiSI
restriction site is included in the gap repair plasmid for linearizing the final
targeting vector before electroporation of ES cells. b, Assembly of final targeting
vectors in a multi-Gateway reaction. See Supplementary Fig. 4 for a full description
of the custom Gateway-adapted plasmids used for vector construction.


High-throughput ES cell production 

To scale targeting experiments to high throughput, we optimized elec- 
troporation conditions for C57BL/6N ES cells'® in multi-well cuvettes. 
Here we aimed to minimize the number of cells and amount of plasmid 
DNA required to obtain sufficient drug-resistant colonies for screening 
(Table 1). After selection, expansion and freezing, most (65%) ES cell 
clones retained their ability to colonize the germ line of mice’®. 

Homologous recombinants generated with targeting vectors are 
usually identified by Southern blotting. However, this method is not 
practical for large-scale screening. Long-range PCR (LR-PCR) is an 
alternative method” which is better-suited to high-throughput geno- 
typing of ES cell clones. We developed a 384-well LR-PCR method to 
identify correctly targeted events (Fig. 5). PCR fragments, amplified 
with gene-specific primers outside the homology arms in combina- 
tion with primers in the targeting cassette, were sequence-verified. In 
general, LR-PCR was performed across the 3’ homology arm. Because 
the targeted clones are genotyped at one end, non-homologous events 
within the opposite arm will occur in rare cases. Furthermore, mixed 
clones composed of targeted and non-targeted cells are not detected 
by our high-throughput genotyping protocol. For these reasons, 
further validation of targeted alleles using standard Southern blot 
assays is highly recommended before use. 

Owing to frequent crossover events between the selectable marker 
and 3’ loxP site, many of the targeted ES cell clones lose the 3’ loxP site 
and cannot be converted to a conditional allele. To distinguish 
between these two alternative products of homologous recombina- 
tion, LR-PCR products amplified from the 3’ homology arm were 
sequenced with a primer at the loxP site. Where 3’ LR-PCR failed
to generate a product, LR-PCR was performed across the 5’ homology
arm (5’ LR-PCR). For these cases, the retention of the 3’ loxP site was 
confirmed by PCR between the cassette and 3’ loxP site. 


Gene targeting is highly efficient 

High-throughput gene targeting depends on achieving high targeting 
efficiencies. For genes expressed in ES cells, a promoterless targeting 
strategy (referred to as ‘targeted trapping’)** has been shown to yield 
targeting efficiencies averaging above 50%. By design, promoterless 
vectors effectively suppress the recovery of random non-homologous 
events in the genome as only insertions in transcribed loci, in the 
correct orientation and reading frame, will confer drug resistance. 
We electroporated 1,285 different promoterless constructs and 
obtained targeted clones from nearly half of these constructs with 
an average targeting efficiency of 50% (Table 1). These data confirm 
and extend the results of ref. 33, demonstrating that targeted trapping 
is a highly efficient method for genes expressed in ES cells. 

Only half of the promoterless targeting vectors were effective in 
producing targeted clones. Electroporation of these vectors produced 
variable numbers of drug-resistant colonies. In general, high colony 
numbers were predictive of successful targeting experiments, whereas 
low colony numbers usually indicated a failure to target the locus (Sup- 
plementary Table 2). The success or failure of a construct correlated 
with the number of clones with gene-trap events in the International 
Gene Trap Consortium database (Supplementary Table 3). Thus, gene- 
trapping data serve as a useful guide to identify the subset of genes that 
are amenable to a promoterless targeting strategy’. Correlation with 
classes of gene was also observed. For instance, targeted trapping was 
less effective with secreted proteins compared to non-secreted proteins, 
indicating that our cassette designed for trapping secreted proteins 
(pL1L2_ST, see Supplementary Fig. 4)*° is not optimal for this class 
of gene”. 

Given that only half of all genes are expressed at a sufficient level in 
ES cells to support a targeted trapping strategy, we switched to using a 
promoter-driven cassette for positive selection for non-expressed 
genes combined with negative DTA selection to select against random 
insertions. We electroporated different positive-negative targeting 
cassettes and from the analysis of approximately 30 ES cell clones 
per unique construct, we recovered targeted events for 80% of genes 
with an average targeting efficiency of 35% (Table 1; for a complete list 
of targeted genes see Supplementary Data). A combination of factors 
probably contribute to our high targeting efficiencies, including the 
use of isogenic DNA, relatively long recombineered homology arms 
and DTA negative selection. 

Gene targeting is dependent on both the length and the extent of 
homology between the targeting vector and the target locus*”*’. Our 
vectors typically contain 10 kb of homology to the endogenous locus 
and originate from a C57BL/6J BAC library. Although the ES cells are 
derived from the C57BL/6N sub-strain, the Jackson (J) and NIH (N) 
substrains of C57BL/6 are very closely related'’, thus our targeting 
vectors will have identical sequence with the ES cell genome in the 
great majority of cases. Negative selection was introduced to improve 
targeting efficiencies**’. Overall we observed a threefold enrichment 
of targeted clones with DTA counter-selection, consistent with pre- 
vious observations*”*?*° (Table 1). 
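
As a quick arithmetic check, the percentages quoted in the text can be recomputed from the counts in Table 1. The sketch below is illustrative only: the dictionary layout and function names are assumptions, and the targeting efficiencies are taken as reported in Table 1 rather than re-derived, because they are averages over individual constructs.

```python
# Counts and reported efficiencies copied from Table 1; the layout is illustrative.
TABLE_1 = {
    "Promoterless":    {"vectors": 1285, "genes_targeted": 621,  "efficiency_pct": 51},
    "Promoter":        {"vectors": 1811, "genes_targeted": 1440, "efficiency_pct": 35},
    "Promoter (-DTA)": {"vectors": 87,   "genes_targeted": 49,   "efficiency_pct": 12},
}


def genes_targeted_pct(row) -> int:
    """Percentage of unique targeting vectors that yielded a targeted gene."""
    return round(100 * row["genes_targeted"] / row["vectors"])


def fold_enrichment(with_dta, without_dta) -> float:
    """Ratio of the reported targeting efficiencies with and without DTA selection."""
    return with_dta["efficiency_pct"] / without_dta["efficiency_pct"]


for vector_type, row in TABLE_1.items():
    print(vector_type, genes_targeted_pct(row))
print(round(fold_enrichment(TABLE_1["Promoter"], TABLE_1["Promoter (-DTA)"]), 1))
```

The ratio of 35% to 12% is about 2.9, in line with the roughly threefold enrichment attributed to DTA counter-selection.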

In a high-throughput pipeline, projects inevitably fail at one or more
steps and overall pipeline efficiency depends on effective recovery of
these failures. In our experience, most failures are technical in nature
and are most efficiently recovered by repeating the procedure.
For example, 70% of targeting experiments are rescued after re-
electroporation of cells with an alternative preparation of vector
DNA (Supplementary Data). Similarly, re-synthesis of oligonucleotides
for recombineering or repeating the Gateway reaction recovers
a majority of intermediate and final targeting vectors (data not shown).
Thus, completion of the mutant resource will require iterative rounds
of recovery. Whether some genes are refractory to targeting will
become apparent once all technical issues have been ruled out.

Table 1 | Targeting efficiency using promoterless and promoter-driven cassettes

Vector type                                     Promoterless   Promoter   Promoter (−DTA)
Number of unique targeting vectors              1,285          1,811      87
Number of successful electroporations           778            1,671      87
Number of colonies*                             224            348        729
Number of genes targeted                        621            1,440      49
Genes targeted (%)                              48             80         56
Targeting efficiency (%)                        51             35         12
Number of colonies screened*                    24             29         34
Number of targeted clones*                      12             10         4
Number of targeted clones with 3’ loxP site*    6              35         1

* Average values.

Figure 5 | Genotyping ES clones by LR-PCR sequencing. Five LR-PCR
reactions are carried out: two 5’ arm (GF/5’U), two 3’ arm (3’U/GR) and one
cassette (3’U/LX). Sequence verification of LR-PCR products is carried out
with gene-specific primers (GF and GR) and with nested primers in the
targeting cassette (5’Us and 3’Us). To confirm the presence or absence of the 3’
loxP site, 3’ arm LR-PCR products are sequenced with a primer adjacent to the
loxP site (LR). In cases where 3’ arm LR-PCR fails to generate a product, the 3’
loxP site is confirmed by sequencing the cassette product.


Discussion

Our targeting pipeline is the major contributor to the international
mouse knockout programmes that aim to generate lacZ-tagged null
mutations in every protein-coding gene in mouse. With the technology
described here, more than 9,000 genes have been successfully targeted in
C57BL/6N ES cells to date. The value of our knockout ES cell resource
critically depends on the germline potential of individual targeted
C57BL/6N ES cell clones. In a separate study’®, hundreds of targeted
cell lines generated in our pipeline were assessed for contribution to the
germline after blastocyst injection. At least 65% of targeted clones
colonized the germ line of chimaeric mice. Thus, our library of mutant
C57BL/6N ES cells is robust and will support the production of mutant
mice for future large-scale phenotyping programmes.

The scale of mass parallel vector construction and gene targeting
described here has implications for functional genomics and proteomics
in many model systems. New systematic, genome-scale programmes
can now be contemplated. Using available BAC or fosmid
genome resources, the high-throughput production of complex transgenes
and/or targeting constructs will facilitate the generation of
sophisticated, physiologically accurate, cell and animal models. For
example, tagging all proteins in the mouse genome by knock-in targeting
to establish a proteomic mapping programme equivalent to the
highly successful yeast TAP-tagging programmes" is now feasible.

In the coming years, it is likely that the genome engineering technologies
pioneered in the mouse will also be applicable to other model
systems such as the rat***’ and human pluripotent stem cells***. The
capacity for fluent gene targeting also permits the systematic generation
of doubly targeted ES cell lines for functional studies by conditional
mutagenesis, which will serve to complement and extend
RNA interference studies by providing complete genetic knockouts.
Coupled with the power to differentiate ES cells into many cell types,
such resources will not only provide a means of gaining unique functional
insights but will also reduce animal experimentation. With pioneering
methodologies, we have overcome the considerable technical
challenges involved in establishing the most complex and accurate
high-throughput functional genomics platform yet attempted. We
believe that our work raises the standards of achievement and expectation
for future genome-scale programmes.


METHODS SUMMARY 


Gene annotation and vector design software. Manual annotation of mouse gene 
structures was carried out as previously described**. Vector designs are based on the 
current release of the Ensembl and Vega databases (NCBIM37 assembly). Critical 
exon(s) for each target gene are identified computationally (start phase − end
phase ≠ 0). Using ArrayOligoSelector*’, our software returns a set of six 50-mer
oligonucleotides at defined distances from the critical exon(s) for recombineering. 
96-well recombineering and three-way Gateway reactions. BAC clones are 
ordered from indexed C57BL/6J libraries” (RP23/24), arrayed in 96-well plates
and transformed with pBADgbaA” plasmid encoding lambda Red recombina- 
tion proteins. Three rounds of recombineering are carried out serially in 96-well 
cultures using DNA cassettes amplified by PCR with primers containing 50- 
nucleotide homology to target sequences*”®. After gap repair, plasmid DNA is 
transformed into Cre-expressing bacteria to reduce the floxed kanR cassette to a 
single loxP site. 

Three-way Gateway reactions containing intermediate vector, attL1/attL2 targeting
cassette and attL3/attL4 DTA plasmids are incubated with LR Clonase II
Plus (Invitrogen), transformed into bacteria and selected on agar plates contain- 
ing appropriate antibiotics and 4-chlorophenylalanine*. Final targeting con- 
structs are sequence-verified across each recombineered junction, linearized 
with AsiSI and visualized on E-Gels (Invitrogen) to verify their size (Supplemen- 
tary Fig. 5). 
Electroporation of ES cells and LR-PCR genotyping. Electroporation of C57BL/ 
6N mouse ES cells'® with linearized plasmid DNA was carried out in 25-well 
electroporation cuvettes (BTX Harvard Apparatus). Stable clones were selected in 
medium containing Geneticin (Invitrogen). Typically 32 clones are picked, 
expanded in 96-well plates and archived in 96-well cryovials (Matrix). 

Long-range PCR reactions using SequalPrep (Life Technologies) or LongAMP 
(NEB) were carried out with genomic DNA from direct lysis of ES cells grown in 
96-well plates. PCR products were visualized on E-gels (Supplementary Fig. 6) 
then treated with exonuclease I and phosphatase (NEB) and sequenced. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS


Computational design of conditional alleles. Gene structures to be targeted are 
first extracted from a current release of the Ensembl (NCBIM37 assembly) or 
Vega database. Critical exons, which when deleted induce a frameshift, are chosen 
computationally (start phase − end phase ≠ 0) or manually (exon length not
divisible by 3). Primers (50-mer oligonucleotides) for recombineering are then 
selected from overlapping blocks of sequence (typically 120 bp) flanking the 
critical exons at a predefined distance from the splice sites (300 bp from the splice 
acceptor and 100 bp from the splice donor). Primers for gap repair were chosen 
from sequence blocks (typically 1 kb) at the ends of the desired homology arms 
(4-6 kb). Each block was analysed by ArrayOligoSelector*® (http://sourceforge. 
net/projects/arrayoligosel/) generating one or more candidate primers inside 
each sequence block with a minimum of 28% G+C content. Candidate primers 
were rejected if they were repetitive inside a region spanning 100 kb either side of 
the critical exon(s), and gap repair primers at the ends of the homology arms were 
also rejected if they shared sequences of 6 bp or more. The final recombineering 
primer sequences were mapped to the current NCBI assembly, recorded with 
their genomic coordinates in a database, and displayed in an Ensembl DAS-track. 
After manual inspection, complete sets of recombineering primers were selected 
from the database, automatically reverse complemented (where appropriate) and 
appended with 20-23 bp of sequence homology to the appropriate recombineer- 
ing cassettes before ordering. In parallel, BACs from the RP23/RP24 indexed 
library were chosen based on end-mappings of the clones. A vector design inter- 
face (Custom Design Tool; http://www.sanger.ac.uk/htgt) is available online. 
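
The filtering and oligonucleotide assembly steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline's own code: the function names are hypothetical, shared sequence between gap repair primers is checked on the same strand only, and the 100-kb repetitiveness check and the ArrayOligoSelector step are not modelled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in `seq`."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def passes_gc_filter(seq: str, min_gc: float = 0.28) -> bool:
    """Apply the minimum G+C content quoted in the text (28%)."""
    return gc_fraction(seq) >= min_gc


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(COMPLEMENT)[::-1]


def share_kmer(a: str, b: str, k: int = 6) -> bool:
    """True if `a` and `b` contain an identical stretch of at least k bases
    (same strand only, which is an assumption of this sketch)."""
    a, b = a.upper(), b.upper()
    kmers = {a[i:i + k] for i in range(len(a) - k + 1)}
    return any(b[i:i + k] in kmers for i in range(len(b) - k + 1))


def build_oligo(homology: str, cassette_primer: str, reverse: bool = False) -> str:
    """Append 20-23 nt of cassette homology to a 50-nt genomic homology block,
    reverse complementing the genomic block first where appropriate."""
    if len(homology) != 50:
        raise ValueError("genomic homology block must be a 50-mer")
    if not 20 <= len(cassette_primer) <= 23:
        raise ValueError("cassette homology must be 20-23 nt")
    arm = reverse_complement(homology) if reverse else homology.upper()
    return arm + cassette_primer.upper()
```

With a 20-nt cassette sequence, `build_oligo` returns a 70-mer, matching the length of the oligonucleotides ordered for recombineering.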
96-well recombineering. BACs from the RPCI-23/RPCI-24 indexed C57BL/6J 
libraries’ were arrayed in 96-well format to match the corresponding 96-well 
plates of 70-mer oligonucleotides (desalted; Illumina/Invitrogen) used to PCR
amplify the cassettes used for recombineering. PCR amplifications were per- 
formed using the FastStart High Fidelity PCR System (Roche) and the products 
were desalted using High Pure 96UF Cleanup kits (Roche). The arrayed BAC 
clones were initially grown at 37 °C in Luria broth (LB) containing chloramphenicol
(12.5 µg ml⁻¹) to early log phase and made electrocompetent by washing three
times with ice-cold HPLC-grade water, and the cells were transformed with
pBADgbaA plasmid DNA” using an ECM 630 96-well electroporator/HT-200
automatic plate handler (BTX Harvard Apparatus; pulse conditions of 2,400 V,
700 Ω, 25 µF) followed by growth at 30 °C in liquid medium containing tetracycline
(5 µg ml⁻¹) and chloramphenicol (12.5 µg ml⁻¹). The BAC cultures underwent
three rounds of recombineering, changing only the PCR products used for each
electroporation and the antibiotic selection applied after each step, using the
following standard procedure: early log phase cultures were induced to express the Red
operon by addition of 0.1% arabinose and incubated for 40 min at 37 °C;
electrocompetent cells were electroporated in 96-well format (as above) with 1-2 µg
of desalted PCR products and allowed to recover at 37 °C for 90 min; an aliquot was
then inoculated into a new 96-well box containing media plus the appropriate
antibiotics and grown at 30 °C for 2 days. The PCR cassette and antibiotic cocktail
used at each step shown in Fig. 3 were as follows: (1) R1-pheS/zeo-R2, zeocin
(4 µg ml⁻¹), tetracycline (5 µg ml⁻¹), chloramphenicol (12.5 µg ml⁻¹); (2) loxP-kan-
loxP, kanamycin (15 µg ml⁻¹), zeocin (6.5 µg ml⁻¹), tetracycline (5 µg ml⁻¹),
chloramphenicol (12.5 µg ml⁻¹); and (3) pR3R4, zeocin (6.5 µg ml⁻¹), kanamycin
(15 µg ml⁻¹), carbenicillin (50 µg ml⁻¹). After the gap repair step, the temperature
was shifted to 37 °C to eliminate the recombineering plasmid. Intermediate plasmid
DNA was purified using standard procedures from saturated cultures (1.5 ml)
grown in 96-well blocks. Approximately 50 ng was transformed into
electrocompetent DH10B E. coli carrying the 705-Cre plasmid (Gene Bridges),
pre-induced at 42 °C to express Cre recombinase from the λ pR promoter, and selected
in liquid culture containing carbenicillin (50 µg ml⁻¹) and zeocin (10 µg ml⁻¹).
After overnight growth at 37 °C, individual colonies were streaked out on
ampicillin/zeocin plates to isolate individual clones and were sequence-verified.
Gateway exchange reaction. Three-way Gateway reactions were carried out in 
96-well format using LR Clonase II Plus enzyme mix (Invitrogen) essentially as 
described by the manufacturer. In an overnight reaction at 25 °C, 100-200 ng of 
intermediate targeting vector (prepared from 1.5-ml cultures in 96-well blocks 
using the Qiagen Turboprep kit) was combined with 60 ng of L1/L2 targeting 
cassette vector and 60 ng of L3/L4 DTA plasmid backbone in a 10 µl volume. After
treatment with Proteinase K, 2 µl of the reaction was transformed into 30 µl of
chemically competent Escherichia coli (DH10B, Invitrogen) and plated onto YEG 
agar plates containing 4-chlorophenylalanine® and the appropriate antibiotics. 
Individual colonies were picked and sequenced across all recombineered junc- 
tions. Reads were automatically aligned against the synthetic vector sequences 
and assigned pass levels based on the number and position of matching reads. 
ES cell culture and electroporation. The final targeting constructs were prepared 
for ES cell electroporation from 2 ml of culture (2× LB plus antibiotics) in 96-well
format using the Qiagen Turboprep kit. Before electroporation, vectors were
linearized with AsiSI and examined by gel electrophoresis. For most clones, the
digested DNA migrated as a single high-molecular-mass band of the expected size 
(Supplementary Fig. 5). Occasionally, contaminating smaller molecular mass 
bands were also observed on the gel (DNA quality failures). 

JM8 mouse ES cell lines derived from the C57BL/6N strain were grown either 
on a feeder layer of SNL6/7 fibroblasts (neomycin and/or puromycin resistant) or
on gelatinized tissue culture plates'®. Both feeder-independent and feeder-
dependent lines were maintained in Knockout DMEM (500 ml, Gibco)
supplemented with 2 mM glutamine, 5 ml 100× β-mercaptoethanol (360 µl in 500 ml
PBS, filter sterilized), 10-15% fetal calf serum respectively (Invitrogen) and
500 U ml⁻¹ leukaemia-inhibitory factor (ESGRO, Millipore). Trypsin solution
was prepared by adding 20 ml of 2.5% trypsin solution (Gibco) and 5 ml chicken 
serum (Gibco) to 500 ml filter-sterilized PBS containing 0.1 g EDTA (Sigma) and 
0.5 g D-glucose (Sigma). 

Electroporations of ES cells were carried out in a 25-well cuvette using the ECM
630 96-well electroporator/HT-200 automatic plate handler (BTX Harvard
Apparatus; set at 700 V, 400 Ω, 25 µF). Immediately before electroporation, cell
suspensions of ~1 × 10⁷ cells and ~2 µg of linearized targeting vector DNA were
mixed in a final volume of 120 µl PBS. Cells were seeded onto a 10-cm dish (with
feeders or gelatin) and colonies were picked after 10 d of selection in 100 µg
(active) per ml Geneticin (Invitrogen). To expand cells into duplicate wells for 
archiving and preparation of genomic DNA, confluent cultures of JM8 ES cells 
grown on feeder cells were washed twice with pre-warmed PBS and trypsinized 
for 15 min at 37 °C. Five volumes of pre-warmed media were added and the cells 
were gently dispersed by trituration and passed at a dilution of 1:4 into new plates
containing feeder cells. Passage of cells grown on gelatinized plates was carried 
out in a similar manner except that the cells were trypsinized for 10 min and 
passed at a dilution of 1:6 into freshly gelatin-coated plates (0.1% gelatin, Sigma 
G1393). Culture medium was replaced daily and cells reached confluence 2 days 
after passage. To archive ES cell clones, trypsinized cells from confluent 96-well 
plates were transferred in 200 µl freezing medium (Knockout DMEM, 15% serum/
10% DMSO) to 96-well cryovials (Matrix) and overlaid with sterile mineral oil.
The cells were placed at −80 °C overnight and then transferred to liquid nitrogen.
Computational design of primers for long-range PCR. To identify targeted ES 
cell clones, we developed a robust LR-PCR system that uses one set of reaction 
conditions for every targeted allele screened. In addition, we used an in-house 
primer generation program (“Primer Brain”) to generate genome-specific pri- 
mers for the LR-PCR. Primers were selected from 2-kb blocks of sequence 
upstream of the 5’ homology arm (GF) and downstream of the 3’ homology 
arm (GR) and from a variable-sized region that contains the critical exon (EX). 
Primers were first extracted by a single-base-pair tiling of each region into 24- to 
30-mers that end in G/C, have at least 10 G/C bases and have a melting temper- 
ature of at least 64 °C. Primer choice was weighted negatively to avoid both ‘runs’ 
of nucleotides (for example, ‘AAA’) and self-annealing ends. The top 100 high-
scoring primers in each region were aligned against the current mouse genome
(NCBIM37) with Exonerate software (http://www.ebi.ac.uk/~guy/exonerate) 
and were weighted negatively based on the number of alignments to the genome, 
with added negative weight given to alignments close to the 3’ end of primers. The 
two best-scoring primers from each block (GF1 and GF2; GR1 and GR2; EX1 and 
EX2) were grouped and primer combinations (for example, GF1 and EX1) were 
screened to eliminate pairs with a 4-bp overlap at their 3’ ends. The resulting GF, 
GR and EX primers were stored in an Oracle database. 
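
The extraction and first scoring step described above can be sketched as follows. This is a minimal illustration, not the Primer Brain program itself: the function names, the melting-temperature approximation (a common basic formula based on G/C content and length) and the simple run penalty are assumptions, and the genome-alignment and 3′-overlap filters are not modelled.

```python
from itertools import groupby


def gc_count(seq):
    """Number of G or C bases in a primer sequence."""
    return sum(base in "GC" for base in seq)


def melting_temp(seq):
    """Approximate Tm in degrees C (assumed formula, not taken from the source)."""
    return 64.9 + 41.0 * (gc_count(seq) - 16.4) / len(seq)


def run_penalty(seq, min_run=3):
    """Count homopolymer runs such as 'AAA' (hypothetical negative weight)."""
    return sum(1 for _, group in groupby(seq) if len(list(group)) >= min_run)


def tile_candidates(region, min_len=24, max_len=30, min_gc=10, min_tm=64.0):
    """Single-base-pair tiling of a region into candidate primers."""
    region = region.upper()
    candidates = []
    for start in range(len(region)):
        for length in range(min_len, max_len + 1):
            primer = region[start:start + length]
            if len(primer) < length:
                break  # ran off the end of the region
            if (primer[-1] in "GC"
                    and gc_count(primer) >= min_gc
                    and melting_temp(primer) >= min_tm):
                candidates.append(primer)
    return candidates


def top_scoring(candidates, keep=100):
    """Rank candidates so that primers with fewer runs come first."""
    return sorted(candidates, key=run_penalty)[:keep]
```

In a fuller pipeline, the output of `top_scoring` would then be passed to the genome alignment step, with further negative weights applied per alignment.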
LR-PCR genotyping. ES cell genomic DNA was isolated by digesting the cells 
with Proteinase K and RNase A. Each well of a confluent 96-well plate was lysed 
with 30 µl of lysis buffer (10 mM Tris/HCl pH 8, 1 mM EDTA, 50 mM KCl, 2 mM 
MgCl2) containing 200 µg ml−1 RNase A (Sigma) and 0.67 mg ml−1 proteinase K 
(Life Technologies). After overnight digestion at 60 °C, the samples were heated 
to 90 °C (2 min) and 1–2 µl of the lysate was used in a 10 µl LR-PCR reaction. To 
generate LR-PCR amplicons, two genomic-specific primers outside each end of 
the 5’ and 3’ homology arms (GF and GR, respectively) were used in combination 
with the appropriate universal cassette primers (5U (5’-CACAACGGGTTC 
TTCTGTTAGTCC-3’) and 3U (5'- ATCCGGGGGTACCGCGTCGAG-3’)) 
(Fig. 5). 

Using the SequalPrep kit (0.1 µl 100% v/v DMSO, 0.5 µl 10× enhancer A, 0.5 µl 
10× enhancer B, 1.0 µl 10× buffer, 0.2 µl Taq enzyme/dNTPs; Life Technologies) 
or LongAMP Taq mix (0.2 µl 100% v/v DMSO (Sigma), 0.3 µl 10 mM dNTPs 
(Thermo Fisher Scientific), 2.0 µl 5× LongAMP buffer (NEB), 0.4 µl LongAMP 
Taq (NEB)), 10 µl reactions were set up in 384-well format with ~30–50 ng 
(1–2 µl) genomic DNA and 12 pmol of each primer. Thermal cycling was performed 
using the following conditions: 1 cycle 93 °C for 3 min; 8 cycles 92 °C for 15 s, 65 °C 
for 30 s decreasing by 1 °C per cycle, 65 °C (LongAMP) or 68 °C (SequalPrep) for 
8 min; 30 cycles 92 °C for 15 s, 55 °C for 30 s, 65 °C (LongAMP) or 68 °C 
(SequalPrep) for 8 min increasing 20 s per cycle; 1 cycle 65 °C (LongAMP) or 
68 °C (SequalPrep) for 9 min. The PCR products were visualized on 1% E-gels (Life 
Technologies) and scored for the presence of high-molecular-mass fragments 
(Supplementary Fig. 6). The LR-PCR products were treated with exonuclease I 
and shrimp alkaline phosphatase (0.3 U µl−1 and 0.19 U µl−1, respectively; NEB) 
in 20 mM Tris/HCl, 10 mM MgCl2 for 1 h at 37 °C followed by 80 °C for 15 min. 
PCR products were sequenced with the genomic primers used for amplification 
and universal primers to the targeting cassette (5′Us (5′-CGTGGTATCGTTATGCGCCT-3′) 
and 3′Us (5′-TCTATAGTCGCAGTAGGCGG-3′)) and 3′ loxP (LR 
(5′-ACTGATGGCGAGCTCAGACC-3′)). Sequence reads were compared 
by BLAST against synthetic sequences for each targeted allele and clones 
with correctly aligned sequences were marked as valid. Clones that retained the 3’ 
loxP site and have 3’ or 5’ sequence-verified LR-PCR bands are marked for dis- 
tribution and clones that have lost the 3’ loxP are marked as targeted, non- 
conditional events. 
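
The touchdown cycling conditions given above for LR-PCR can be written out step by step. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the 20 s extension increment starts from the second of the 30 cycles, which the text does not state explicitly. Ramp times between steps are ignored.

```python
def lr_pcr_program(polymerase="LongAMP"):
    """Return the cycling program as a list of (label, temp_c, seconds) steps."""
    extension_temp = {"LongAMP": 65, "SequalPrep": 68}[polymerase]
    steps = [("initial denaturation", 93, 3 * 60)]

    # 8 touchdown cycles: annealing starts at 65 C and drops 1 C per cycle.
    for cycle in range(8):
        steps.append(("denature", 92, 15))
        steps.append(("anneal", 65 - cycle, 30))
        steps.append(("extend", extension_temp, 8 * 60))

    # 30 cycles at 55 C annealing: extension starts at 8 min and grows by
    # 20 s per cycle (assumed to begin growing from the second cycle).
    for cycle in range(30):
        steps.append(("denature", 92, 15))
        steps.append(("anneal", 55, 30))
        steps.append(("extend", extension_temp, 8 * 60 + 20 * cycle))

    steps.append(("final extension", extension_temp, 9 * 60))
    return steps


def total_hold_seconds(steps):
    """Sum of programmed hold times, excluding ramping between temperatures."""
    return sum(seconds for _, _, seconds in steps)
```

Under these assumptions the programmed hold time is a little over 8 hours, most of it spent in the lengthening extension steps.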
Latent TGF-β structure and activation 


Minlong Shi, Jianghai Zhu, Rui Wang, Xing Chen, Lizhi Mi, Thomas Walz & Timothy A. Springer 


Transforming growth factor (TGF)-β is stored in the extracellular matrix as a latent complex with its prodomain. 
Activation of TGF-β1 requires the binding of αv integrin to an RGD sequence in the prodomain and exertion of force 
on this domain, which is held in the extracellular matrix by latent TGF-β binding proteins. Crystals of dimeric porcine 
proTGF-β1 reveal a ring-shaped complex, a novel fold for the prodomain, and show how the prodomain shields the 
growth factor from recognition by receptors and alters its conformation. Complex formation between αvβ6 integrin and 
the prodomain is insufficient for TGF-β1 release. Force-dependent activation requires unfastening of a ‘straitjacket’ that 
encircles each growth-factor monomer at a position that can be locked by a disulphide bond. Sequences of all 33 TGF-β 
family members indicate a similar prodomain fold. The structure provides insights into the regulation of a family of 
growth and differentiation factors of fundamental importance in morphogenesis and homeostasis. 


The TGF-β family is key to specifying the body plan during metazoan 
development. Members of this family, including nodal, activins, 
inhibins, bone morphogenetic proteins (BMPs) and growth differentiation 
factors (GDFs), specify the anterior/posterior and dorsal/ventral 
axes, endoderm, mesoderm and ectoderm, left–right asymmetry 
and details of individual organs. TGF-β1, TGF-β2 and TGF-β3 
are important in development, wound healing, immune responses 
and tumour-cell growth and inhibition. 

Although TGF-β synthesis and expression of its receptors are widespread, 
activation is localized to sites where TGF-β is released from 
latency. TGF-β family members are synthesized with large amino-terminal 
prodomains, which are required for the proper folding and 
dimerization of the carboxy-terminal growth-factor domain. Despite 
intracellular cleavage by furin, after secretion, noncovalent association 
persists between the dimeric growth-factor domain and prodomain 
of TGF-β, and of an increasingly recognized number of other 
family members. The prodomain is sufficient to confer latency on 
some family members and it also targets many members for storage 
in the extracellular matrix, in complex with latent TGF-β binding 
proteins (LTBPs) or fibrillins. 

The prodomains of TGF-β1 and TGF-β3 contain an RGD motif 
that is recognized by αv integrins. Mice with the integrin-binding 
RGD motif mutated to RGE recapitulate all major phenotypes of 
TGF-β1-null mice, including multi-organ inflammation and defects 
in vasculogenesis, thus demonstrating the essential role of integrins in 
TGF-β activation. Among αv integrins, the phenotypes of integrin 
β6-null and integrin β8-null mice demonstrate the particular importance 
of the αvβ6 and αvβ8 integrins for activation of TGF-β1 and 
TGF-β3 in vivo. 

Integrin binding alone is not sufficient for TGF-β activation. 
Activation by αvβ6 integrin requires incorporation of TGF-β1 into the 
extracellular matrix, by association with LTBP, and association of the β6 
cytoplasmic domain with the actin cytoskeleton. Furthermore, contractile 
force is necessary for TGF-β activation by myofibroblasts. Thus, 
tensile force exerted by integrins across the LTBP–prodomain–TGF-β 
complex is hypothesized to change the conformation of the prodomain 
and to free TGF-β for receptor binding. Here, we describe the structure 
of latent TGF-β, mechanisms for latency and integrin-dependent 
activation, and broad implications for the regulation of bioactivity in the 
TGF-β family. 


Crystal structure 

The structure of pro-TGF-β1 at 3.05 Å (Fig. 1a–c and Supplementary 
Table 1) was solved using multi- and single-wavelength anomalous 
diffraction. Electron density maps (Supplementary Fig. 1) were 
improved by multi-crystal, multi-domain averaging over four monomers 
per asymmetric unit. In a ring-like shape, two prodomain arm 
domains connect at the elbows to crossed ‘forearms’ formed by the 
two growth-factor monomers and by prodomain ‘straitjacket’ elements 
that surround each growth-factor monomer (Fig. 1a–c and 
4a). The centre of the ring contains solvent. The arms come together 
at the neck, where they are disulphide-linked in a bowtie, and RGD 
motifs locate to each shoulder (Fig. 1a). On the opposite side of the 
ring where the straitjacketed forearms cross, LTBP would be linked to 
straitjacket residue Cys 4, which is mutated to serine in the crystallization 
construct (Fig. 1a–c). 

The arm domain, residues 46–242, has a novel fold with unusual 
features. Its two anti-parallel, four-stranded β-sheets bear extensive 
hydrophobic faces but these overlap only partially in the hydrophobic 
core (Fig. 1a, e). The hydrophobic faces are extended by long meanders 
between the two sheets and burial by the α2, α3 and α4 helices. 

β-strands β8 and β9 extend on the two-fold pseudo-symmetry axis to 
link the two arm domains in a bowtie at the neck (Fig. 1a). The bow is 
tied with reciprocal inter-prodomain disulphide bonds, Cys 194–Cys 196 
and Cys 196–Cys 194, and by hydrophobic residues (Fig. 1a, e). 

Arg 215 of the RGD motif locates to a disordered loop (residues 
209–215) following the bowtie β9 strand. Partially ordered Gly 216 
and Asp 217 of the RGD motif (Fig. 1a, d) begin the long, 12-residue 
meander across the hydrophobic face of the neck-proximal β-sheet 
that connects to β10 in the forearm-proximal β-sheet. 

The straitjacket, residues 1–45, is formed by the α1 helix and the 
latency lasso (Fig. 1a–c and 2a). The latency lasso, an extended loop that 
connects the α1 and α2 helices, has little contact with the remainder of 
the prodomain while encircling the tip of each TGF-β monomer 
(Fig. 1a, b, f). Six proline residues and three aliphatic residues make 
hydrophobic contacts with an extensive array of growth-factor aromatic 
and aliphatic residues, and help these to stabilize the conformation 
of the latency lasso (Fig. 1f). 

A highly hydrophobic face of the amphipathic α1 helix, bearing 
isoleucine and leucine residues, interacts with Trp 279 and Trp 281 
and with aliphatic side chains on one growth-factor monomer (Fig. 1f, 
Figure 1 | Architecture of proTGF-β1. Arm, straitjacket and TGF-β1 
monomer segments are coloured differently. a, b, Overall structure. Spheres 
mark the last residue visible in density in the prodomain and the first residue of 
the growth factor. Disordered segments are dashed. Red arrows show the 
directions of forces during activation by integrins. Key side chains are shown in 
stick representation, including Asp of the RGD motif in cyan and CED 
mutations in white. Disulphide bonds and the Cys 4 mutation to Ser are in 
yellow. c, Schematic of the structure and activation mechanism. SS, disulphide 
bonds. d, Hydrophobic residues near Asp 217 of the RGD motif. e, Arm 
domain. Side chains for the hydrophobic core are shown in gold (also marked 
in Fig. 2 and Supplementary Fig. 2), conserved α2-helix residues that interact 
with the growth factor are in pink, fastener residues are in silver and bowtie 
residues are in light green for aliphatics and yellow for Cys. f–h, Straitjacket and 
fastener details. 
g). The tryptophan residues are further covered by lasso residues 
Leu 30, Pro 33 and Pro 34 (Fig. 1f). Notably, the α1 helix is buried 
deeply in an interface between the two growth-factor monomers 
(Fig. 1a, b, g). The interface on the second monomer includes three 
tyrosine residues (Fig. 1g). 

Together with the straitjacket, the arm domain completes the encirclement 
of each growth-factor monomer. The α2 helix buries 
Val 338 and Val 347 of the TGF-β finger (Fig. 1f). The α5 helix projects 
from the base of the arm domain (Fig. 1a and e) and in it, 
prodomain residue Arg 238 forms a salt bridge to Glu 348 in TGF-β1 
(Fig. 1f, h). These two residues are invariant in proTGF-β1, 
proTGF-β2 and proTGF-β3 (Supplementary Fig. 2). 

The straitjacket is fastened to arm residues 74–76 (Fig. 1a, h). A 
backbone hydrogen bond between the nitrogen of Ala 76 and the 
oxygen of Lys 27 caps the C-terminal end of the α1 helix (Fig. 1h). 
Moreover, the carbonyl oxygen of Ala 76 forms a hydrogen bond to 
Arg 238 in the α5 helix (Fig. 1h). Lys 27 is a key fastener residue. Its 
side chain forms a π-cation bond to the side chain of Tyr 74, a hydrogen 
bond to the backbone of Tyr 74, and hydrogen bonds to the 
backbone and side chain of Ser 351 (Fig. 1h). Van der Waals contacts 
between the bulky side chains of the fastener residues Lys 27, Tyr 74 
and Tyr 75 also secure the straitjacket. Notably, Lys 27, Tyr 74 and 
Tyr 75 are invariant among TGF-β1, TGF-β2 and TGF-β3 (Supplementary 
Fig. 2). Fastening is reinforced by backbone hydrogen 
bonds between arm β1-strand residues 77 and 78 and growth-factor 
β-fingers, which join the prodomain and the growth factor in a super-β-sheet 
(Fig. 1a). 

The TGF-β dimer forms the forearms, although TGF-β monomers 
have also been described as hand-like (Fig. 3). Each monomer has 
no hydrophobic core, aside from the cystine knot motif in which one 
disulphide passes between two polypeptide segments bridged by two 
other disulphides. 

Prodomain-bound TGF-β1 differs markedly from previous TGF-β 
structures in both the orientation between monomers and the position 
of elements within monomers (Fig. 3 and Supplementary Fig. 3). 
The Cα root-mean-squared deviations over all 112 residues are 7 Å, 
and 2 Å over the most similar 85 residues. The largest differences are 
imposed by the prodomain α1 helix. It occupies a similar position to 
the growth-factor α3 helix in mature TGF-β1 (Supplementary Fig. 3). 
Intercalation of the prodomain α1 helix between the growth-factor 
monomers reduces the total area buried between monomers from 
850 Å² to 335 Å². The large conformational changes in TGF-β1 are 
Figure 2 | The TGF-β family. a, Sequence alignment of five representative 
prodomains. Orange circles mark the core hydrophobic residues shown in 
Fig. 1e. Inh-α, inhibin-α; Myost, myostatin. Black dots over TGF-β1 sequence 
mark decadal residues. Vertical dashed lines mark cleavage sites. 
b, Phylogenetic tree of the TGF-β family, based on the alignment in 
Supplementary Fig. 2. 
Figure 3 | Shielding from receptor binding. ProTGF-β1 and TGF-β1 in 
complex with its receptors (R type I and II) (ref. 15) were superimposed on the 
TGF-β dimers. For clarity, only one monomer of each is shown. The receptors 
are shown as transparent molecular surfaces. Elements of the prodomain that 
clash with the receptors are labelled. 


driven by an intimate interaction between the growth-factor and 
prodomain dimers, which buries a total area of 2,440 Å². 
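
The Cα root-mean-squared deviations quoted in this section compare matched atoms in two structures. A minimal sketch of that calculation is given below, assuming the two coordinate sets have already been superimposed and are listed in the same residue order. The function names are hypothetical, and the subset helper is a naive stand-in: structural comparison programs typically re-superimpose on the selected subset, which this sketch does not do.

```python
import math


def squared_distance(p, q):
    """Squared Euclidean distance between two (x, y, z) points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def rmsd(coords_a, coords_b):
    """Root-mean-squared deviation over all matched atom pairs."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = sum(squared_distance(p, q) for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


def rmsd_most_similar(coords_a, coords_b, keep):
    """RMSD over the `keep` closest pairs (naive: no re-superposition)."""
    if len(coords_a) != len(coords_b) or not 1 <= keep <= len(coords_a):
        raise ValueError("keep must be between 1 and the number of pairs")
    squared = sorted(squared_distance(p, q) for p, q in zip(coords_a, coords_b))
    return math.sqrt(sum(squared[:keep]) / keep)
```

Restricting the calculation to the best-matching pairs always gives a value no larger than the all-atom figure, which is why the deviation over the most similar 85 residues is smaller than that over all 112.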


Implications for biosynthesis 


Folding and secretion of active TGF-β1 and activin A require the co-expression 
of their prodomains, whereas the TGF-β1 prodomain can 
be biosynthesized in the absence of the growth-factor domain. These 
findings suggest that the C-terminal growth-factor domain folds either 
concomitantly with, or subsequently to, the N-terminal prodomain. 
The kinetics of biosynthesis of TGF-β1, activin and anti-Müllerian 
hormone are slow and for anti-Müllerian hormone, folding of the 
growth-factor domain is rate-limiting. Regions of the prodomain that 
may be particularly important in templating the folding of the growth-factor 
domain include the β1 strand that forms a supersheet with the 
TGF-β fingers and the α1 and α2 helices, which pack against extensive 
hydrophobic interfaces on opposite sides of the growth-factor fingers 
(Fig. 1a, f, g). Residues Ile 17, Ile 24, Leu 25 and Leu 28 in the α1-helix 
interface, and Leu 30 in the lasso interface (Fig. 1f, g), have been specifically 
identified as important for TGF-β1 association. The embrace 
of the fingers of each growth-factor monomer may complement the 
correct formation of the cystine knot and inter-monomer disulphide 
bonds in TGF-β. The structure of proTGF-β1 makes these disulphides 
accessible to disulphide isomerases during biosynthesis (Fig. 1b). 

A definitive assignment of which growth-factor and prodomain 
monomers derive from the same polypeptide chain is not possible 
because of intracellular cleavage by furin and lack of density for residues 
243–249. However, cleavage is incomplete and the small amount 
of uncleaved proTGF-β that is present in protein preparations co-crystallizes 
with cleaved proTGF-β (Supplementary Fig. 1d), indicating 
that there is no major conformational change after cleavage. 
A long prodomain–growth-factor connection through the centre of 
the ring, spanning ~50 Å, would require substantial conformational 
change and would limit access to furin. Therefore, we have assigned the 
shorter ~30 Å connection, between the C terminus of the prodomain 
and the N terminus of the growth factor, located on the same side of the 
ring (for example, the magenta and gold spheres in Fig. 1a, b). 
Figure 4 | ProTGF-β1 complexes with LTBP and αv integrin, and 
activation of TGF-β. a–d, Representative negative-stain electron microscopy 
class averages of proTGF-β (a), the complex of proTGF-β with a fragment of 
LTBP1 containing TGF-binding domain 3–EGF–EGF–TGF-binding domain 4 
(b) and complexes of proTGF-β1 with αvβ6 integrin, prepared with an excess of 
proTGF-β1 (c) or an excess of αvβ6 integrin (d). Scale bars, 100 Å. e, Non-reducing 
SDS–PAGE of the complex peak from S200 gel filtration used for 
electron microscopy in d (lane 1), αv integrin (lane 2) and proTGF-β (lane 3). 
LAP, latency-associated protein (prodomain). f, Activation of TGF-β1. 293T 
cells stably transfected with αvβ6 integrin or a mock control were additionally 
transfected with the indicated wild-type (WT) or mutant proTGF-β1 
constructs, or with empty vector (mock), and co-cultured with TGF-β indicator 
cells. g, Material made by the indicated mutants in 293T cells was heated at 
80 °C and assayed with indicator cells. Error bars in f and g show the s.e.m. of 
3–9 samples from 1–3 representative experiments. h, i, Western blots of 
proTGF-β1 secreted by the indicated transfectants, using an antibody to the 
prodomain (h) or streptavidin to detect biotinylated cysteines (i). 


This assignment indicates a swap in the growth-factor monomer 
that each prodomain monomer embraces, with the intimate interactions 
described above occurring between prodomains and growth-factor 
domains that are present on different polypeptide chains in 
the precursor in the endoplasmic reticulum (for example, the gold 
and green domains in Fig. 1). Thus, the surface area buried between 
the growth-factor domains and prodomains of different precursor 
monomers (900 Å²) is substantially larger than that within the same 
precursor monomer (370 Å²) and adds substantially to the inter-monomer 
interfaces of the growth factor (330 Å²) and prodomain 
(600 Å²). Swapping may be important in regulating the proportion of 
growth-factor heterodimers, which have unique functions in settings 
such as dorsoventral patterning. 


Complex with LTBP and activation 


In the large latent complex, a single LTBP molecule is disulphide-bonded 
to two proTGF-β monomers. We confirmed this unusual 1:2 
stoichiometry by multi-angle light-scattering mass measurements of 
the complex with an LTBP fragment. The LTBP-crosslinking Cys 4 
residues in each monomer (serines in our construct) are 40 Å apart 
(Fig. 1b). Their linkage to LTBP will reinforce the straitjacket by 
fastening together the forearms (Fig. 1c). The large 40 Å separation 
may represent a mechanism for preventing disulphide linkage 
between the two Cys 4 residues. In contrast, the two cognate cysteines 
in TGF-binding domain 3 of LTBP become linked to one another in the 
absence of complex formation. These cysteines are surface-exposed 
and surrounded by acidic residues that interact with proTGF-β (refs 
20, 21). In agreement with this, the straitjacket α1-helices bear basic 
residues (Fig. 1b). Moreover, between the two prodomain α1-helices, a 
concave growth-factor surface that bears numerous hydrophobic residues, 
including proline and disulphide-linked cysteine, is available for 
interaction with LTBP (Fig. 1b). 

Negative-stain electron microscopy class averages of proTGF-β are 
in excellent agreement with our crystal structure (Fig. 4a and Supplementary 
Fig. 4). A complex with an LTBP fragment that contains 
TGF-β-binding domains 3 and 4 and two intervening EGF domains 
shows no major conformational change in the proTGF-β moiety 
(Fig. 4b). An additional density corresponding to the LTBP fragment 
is present on the periphery of the ring, as expected from our crystal 
structure, and causes the ring to lie at an angle to the substrate in some 
class averages (Fig. 4b, middle panel and Supplementary Fig. 4c). 

The RGD motifs in the shoulders of proTGF-β are highly accessible 
for integrin binding (Fig. 1a). In contrast to many RGD motifs that are 
present in extended loops, the position of Asp 217 is stabilized by 
burial of Leu 218 and by a 218–131 backbone hydrogen bond 
(Fig. 1d). Exposed hydrophobic side chains that are nearby on the 
body of the arm domain (Fig. 1d) may increase affinity for integrins. 

The integrin αvβ6 ectodomain, with a C-terminal clasp, formed 
complexes with proTGF-β1 in Ca²⁺ and Mg²⁺ that could be isolated 
by gel filtration, demonstrating unusually tight binding for an integrin 
(Fig. 4c–e and Supplementary Fig. 4a). Different input ratios of 
proTGF-β1 and αvβ6 integrin yielded 1:1 (Fig. 4c) and 1:2 (Fig. 4d, 
e) complexes. Binding to ligand stabilized extension of the integrin 
legs and the open conformation of the αvβ6 headpiece (Fig. 4c, d and 
Supplementary Fig. 4d, e). Integrins were bound to proTGF-β at the 
interface between a large density, corresponding to the αv β-propeller 
domain, and a small density, corresponding to the β6 βI domain, with 
their legs extending away from proTGF-β. The spacing between the 
binding sites on the ring seen by electron microscopy (40–50 Å) was 
appropriate for that between the two RGDs in the crystal structure 
(45 Å). No major conformational change in proTGF-β1 was apparent, 
even with two integrins bound (Fig. 4d), and SDS–polyacrylamide gel 
electrophoresis (SDS–PAGE) confirmed the presence of TGF-β1 in 
the complex (Fig. 4e). These biochemical and structural studies demonstrate 
that integrin binding to proTGF-β1 is not sufficient for 
release of TGF-β1, consistent with previous cell-biological assays. 
The requirements of (1) attachment of proTGF-β through LTBP to 
the extracellular matrix; (2) integrin attachment to the cytoskeleton 
and (3) cellular contraction indicate that the generation of tensile 
force across proTGF-β1 is required for activation of TGF-β. 

The structure enables the overall mechanism of TGF-β1 activation 
by applied force to be readily predicted (Fig. 1c). Tensile force applied 
to the RGD motifs in the shoulders by αv integrins attached to the 
actin cytoskeleton will be resisted at the opposite end of the ring, 
where the Cys 4 residues in the straitjacket are disulphide-linked to 
LTBP, which is tightly associated with the extracellular matrix. Pulling 
force will be applied in the directions shown by red arrows in Fig. 1a. 

The direction of the pulling force and fold topology strongly influence 
the unfolding pathway and resistance to force. β-Sheet proteins 
are the most force-resistant and thus the arm domain will be the most 
force-resistant portion of the prodomain. Pulling against the RGD 
motif will be transmitted through the long meander to the β10 strand. 
Force transmitted from the Cys 4 residues through the straitjacket will 
be resisted by the β1 strand. The β1 and β10 strands are each parallel 
to applied force and adjacent in a β-sheet (Fig. 1) and are thus in the 
most force-resistant structural geometry known, the hydrogen-bond 
clamp. By contrast, the topologies and geometries of the α-helices 
and the long latency lasso of the straitjacket are ill-suited to resist 
force. Force on Cys 4 will apply leverage to the C-terminal end of 
the α1 helix and weaken interactions with fastener residues. After 
unfastening, the long latency lasso, which has no stabilizing hydrogen-bond 
interactions, will be easily elongated and straightened by the 
applied tensile force. Thus, freed by opening of its straitjacket, TGF-β 
will be released from the prodomain and activated for receptor binding 
(Fig. 1c). 

The prodomain not only holds TGF-β in a markedly different 
conformation from when it is free or bound to receptors, it also blocks 
receptor access completely. TGF-β family members are recognized by 
two type I receptors and two type II receptors that surround the 
growth-factor dimer (Supplementary Fig. 3e). Binding of the type 
II receptor to the finger-tips of the growth factor is blocked by the 
latency lasso, and binding of the type I receptor to the body of the 
growth-factor domain is blocked by the prodomain α1 helix, α5 helix, 
the fastener and the ends of β-strands 1, 3 and 10 (Fig. 3). Although 
straitjacket removal might be sufficient to allow binding of type II 
receptors, type I receptor interactions overlap with so many interactions 
between TGF-β and the arm domain (Fig. 3) that complete 
release from the prodomain would be required for receptor binding. 
The structure thus shows that integrins could not expose TGF-β 
sufficiently for receptor activation if it remained bound to the prodomain, 
and that other explanations should be sought for the greater 
activity of integrin-activated TGF-β on neighbouring cells than on 
distant cells. 

To test the importance of unfastening in TGF-β activation, key 
residues were mutated. Non-conservative substitutions of the fastener 
residues Tyr 74 and Tyr 75 resulted in spontaneous, non-integrin-dependent 
TGF-β activation (Fig. 4f). Among the different amino 
acids to which Tyr 74 and Tyr 75 were mutated, only phenylalanine 
was not activating. As a control, mutation of nearby Leu 28, in a 
hydrophobic interface with TGF-β, was not activating (Fig. 4f). 
These results are consistent with the importance to fastening of 
π-bonding and of van der Waals interactions of the aromatic tyrosine 
side chains. 

In the fastener, the Cα carbons of Lys 27 and Tyr 75 are only 4.1 Å 
apart, permissive for disulphide bond formation in a K27C/Y75C 
mutant, as confirmed by free-cysteine labelling of the Y75C mutant 
but not the K27C/Y75C mutant (Fig. 4h, i). The K27C mutation 
greatly reduced expression (Fig. 4h). Similarly, a K27A mutation 
greatly reduces expression, and also releases free TGF-β1 (ref. 18). 
The Y75C mutant was constitutively active (Fig. 4f). The K27C/Y75C 
double mutation rescued expression compared to K27C, prevented 
the spontaneous release of TGF-β that was seen with Y75C and, compared 
to wild type, made proTGF-β completely resistant to integrin-αvβ6-dependent 
activation (Fig. 4f, h). Denaturants such as heat can 
unfold proteins by pathways distinct from applied force. Heat released 
comparable amounts of active TGF-β from wild type and the K27C/Y75C 
mutant (Fig. 4g). Thus, a disulphide bond can fasten the straitjacket 
permanently and prevent integrin-dependent activation. These 
results support the hypothesis that tensile force applied to the prodomain 
by integrins can release TGF-β, and emphasise the importance of 
straitjacket unfastening in integrin-dependent activation. 


Mutations in disease 

Camurati–Engelmann disease (CED), which is characterized by 
thickening of the shafts of the long bones with pain in muscle and 
bone, is caused by mutations in the prodomain of TGF-β1 that 
increase its release. Among CED mutations, Y52H disrupts an 
α2-helix residue that cradles the TGF-β fingers (Fig. 1a, f). The 
charge-reversal E140K and H193D mutations disrupt a pH-regulated 
salt bridge between Glu 140 and His 193 in the dimerization interface 
of the prodomain (Fig. 1a). ‘Hotspot’ residue Arg 189 is substantially 
buried: it forms a cation-π bond with Tyr 142 and salt bridges across 
the dimer interface with bowtie residue Asp 197 (Fig. 1a). Moreover, 
CED mutations in Cys 194 and Cys 196 demonstrate the importance 
of the bowtie disulphide bonds. 


Implications for the large TGF-β family 


The TGF-β family consists of 33 members (Fig. 2b). Although 
growth-factor domains are highly conserved, prodomains vary in 
length from 169 to 433 residues, and are variously described as unrelated 
in sequence or low in homology. However, alignment shows that 
all prodomains have a similar fold (Fig. 2a and Supplementary Fig. 2). 
Deeply buried hydrophobic residues in core secondary-structure elements 
of the arm domain, that is, the α2 helix and β-strands 1–3, 6, 7 
and 10, are conserved in all members (gold side chains in Fig. 1e and 
orange circles in Fig. 2 and Supplementary Fig. 2). 

Most family members also contain clear sequence signatures for the 
amphipathic C-terminal portion of the α1 helix that inserts intimately 
between the two growth-factor monomers (Fig. 2 and Supplementary 
Fig. 2). A similar insertion in inhibin-α and inhibin-βA has been 
demonstrated by mapping disruptive mutations to the equivalents 
of Ile 24 and Leu 28 in TGF-β (Fig. 1f, g). Many family members 
also contain proline-rich latency lasso loops with lengths that are 
compatible with encirclement of the growth-factor β-finger (Fig. 2 
and Supplementary Fig. 2). Thus, a prodomain structure similar to 
that of proTGF-β, including a portion of the straitjacket, is widespread 
in the TGF-β family. However, the low sequence identity and many 
insertions and deletions indicate substantial specializations. 

Differences in prodomain dimerization among family members are 
indicated by variations in cysteine positions. The bowtie (β-strands 8 
and 9) and its disulphides are specializations. Inhibin-α and -β subunits 
have cysteines in similar positions, whereas other family members 
either have cysteine residues in the β7 strand or lack cysteines 
altogether in this region (Fig. 2 and Supplementary Fig. 2). 

The interface between the two arm domains in the β4 and β5 
strands is modest in size and lacks hydrophobic and conserved residues. 
GDF1 and GDF15 specifically lack the β4 and β5 strands, which 
are adjacent in sequence and structure, on the edge of a β-sheet 
(Fig. 1a and Supplementary Fig. 2). Therefore, arm-domain dimerization 
seems to be variable or absent in some family members. 

The close relatives of TGF-β, myostatin and GDF11, which are also 
latent, show conservation of the fastener residues Lys 27 and Tyr 75 
(Fig. 2a and Supplementary Fig. 2). Myostatin regulates muscle mass 
and is stored in the extracellular matrix, bound to LTBP3. Release of 
myostatin and GDF11 from latency requires cleavage of the prodomain 
between Arg 75 and Asp 76 by BMP1/tolloid metalloproteinases 
(reviewed in ref. 26). This cleavage is between the α2 helix and the 
fastener (Fig. 2a). Thus at least two different methods of unfastening 
the straitjacket, force and proteolysis, can release family members 
from latency. 

An increasingly large number of TGF-β family members are recognized 
to remain associated with their prodomains after secretion, 
including BMP4, BMP7, BMP10, GDF2, GDF5 and GDF8 (ref. 27). 
Furthermore, many of these prodomains bind with high affinity to 
fibrillin-1 and fibrillin-2. Targeting by the prodomain to the extracellular 
matrix may be of wide importance in regulating bioactivity in 
the TGF-β family. Moreover, binding to LTBPs or fibrillins seems to 
strengthen the prodomain–growth-factor complex. Thus, although 
only a limited number of TGF-β family members are latent as 
prodomain–growth-factor complexes, the concept of latency may 
extend to other members when their physiologically relevant complexes 
with LTBPs and fibrillins are considered. 

The signalling range of BMP4 in vivo is increased by extracellular
cleavage of the prodomain by furin-like proteases at a second site
upstream of the prodomain-growth-factor cleavage site. Notably,
the second site is in the disordered loop bearing the arginine of RGD
in TGF-β1 (Fig. 2a). Loss of the central β10 strand between the two
cleavage sites results in loss of binding of the BMP4 prodomain to its
growth factor.

The prodomain of Nodal, which binds to Cripto, targets Nodal for
cleavage by proteases secreted by neighbouring cells. Anti-Müllerian
hormone is secreted largely uncleaved and association with the prodomain
greatly potentiates its activity in vivo. Lefty protein, which is
involved in establishing bilateral asymmetry, is not cleaved between
the arm and growth-factor domains, and is cleaved instead between
the α2 helix and the fastener (Fig. 2). Notably, release of the
straitjacket should be sufficient to enable access of type II receptors to
growth-factor domains.


Concluding perspective 


We have described the structure of latent TGF-β1 and a force-dependent
mechanism for its activation by αv integrins. It is notable that so many
members of the TGF-β family associate with fibrillins or with LTBPs,
which co-assemble in the elastic fibres of connective tissues. Forces
acting on elastic fibres would extend fibrillins and LTBPs, and we
speculate that this could weaken their association with TGF-β family
members, enabling release and activation. It is thus possible that
force-dependent regulation of TGF-β family activation could extend beyond
integrin-dependent mechanisms and could be important in a wide
variety of contexts, including regulation of bone and tissue growth.
Although prodomains in the TGF-β family are diverse, their sequences
are highly conserved between species. Further studies are required to
address the diversity of mechanisms by which prodomains regulate
latency and activation in the TGF-β family.


METHODS SUMMARY 


Porcine proTGF-β1 with an N-terminal tag and the C4S and N147Q mutations was
expressed in CHO-Lec 3.2.8.1 cells. The protein was purified, treated with 3C
protease to remove tags and then crystallized. Diffraction-quality crystals were
obtained in 6.5-7.5% (w/v) polyethylene glycol 3500, 17-18% isopropanol, 4-5%
glycerol and 0.1 M sodium citrate, pH 5.6. Structures were solved by Se
multi-wavelength anomalous dispersion and Hg single-wavelength anomalous
dispersion. Maps were improved by multi-crystal averaging. The structure was built
manually and refined to Rwork and Rfree factors of 27.4% and 31.1%, respectively.
The αvβ6 ectodomain was expressed using C-terminal α-helical coiled-coils and
tags, purified and subjected to negative-stain electron microscopy. TGF-β assays
used 293T cells stably transfected with αvβ6 integrin and transiently transfected
with mutant human proTGF-β1, then co-cultured with indicator cells expressing
a luciferase gene under the control of a TGF-β1-inducible promoter.


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS

ProTGF-β1. The porcine proTGF-β1 construct with the rat serum albumin
leader sequence (MKWVTFLLLLFISGSAEFS), followed by eight histidine residues,
a streptavidin-binding peptide (TTGWRGGHVVELAGELEQLRARLEHHPQGQREP)
and a HRV-3C protease site (LEVLFQGP) was amplified from
pcDNA-GS-TGF-β1 (ref. 33). Porcine proTGF-β1 with the C4S mutation was
amplified from the latter construct. The C4S mutation increases proTGF-β1
expression and avoids inappropriate disulphide bond formation. No crystals
were obtained with this construct. One or two N-linked sites were deleted in the
N147Q and N107Q/N147Q constructs, which were expressed similarly to wild
type. The best crystals were obtained with N147Q; the N107Q/N147Q mutant
yielded needles that could not be optimized. CHO-Lec 3.2.8.1 cells were
transfected by electroporation and cultured with 10 µg ml⁻¹ puromycin. Clones were
screened for expression using a sandwich enzyme-linked immunosorbent assay
(ELISA) with a capture antibody to prodomain-1 (R&D Systems) and a
biotinylated detection antibody to the His tag (Qiagen). The clone with highest
expression of proTGF-β1 (~2 mg l⁻¹) was expanded and cultured in roller bottles with
J/J medium and 5% fetal bovine serum (FBS). Supernatants were collected every
5 d, clarified by centrifugation, concentrated tenfold with tangential flow
filtration (Vivaflow 200, Sartorius Stedim), diluted fivefold with 10 mM Tris-HCl,
0.14 M NaCl (TBS, pH 8.0), then concentrated fivefold. Material was adjusted
to 0.2 M NaCl and purified using Ni-NTA agarose (Qiagen) (25 ml per 5 l of
culture supernatant), then washed with three column volumes of 0.6 M NaCl,
0.01 M Tris (pH 8.0) and eluted with 0.25 M imidazole in TBS. Material was
adjusted to pH 7.4, applied to Strep-tactin agarose (IBA) (3 ml per 5 l of culture
supernatant) and washed with TBS (pH 7.4). Then 4 ml of recombinant
His-tagged HRV-3C protease (Novagen, 100 U mg⁻¹, 1 mg ml⁻¹), diluted 20-fold
in TBS (pH 7.4) with 10% glycerol, was applied to the column, which was held
at 4 °C for 16 h. The flow-through with two column volumes of TBS (pH 7.4),
containing untagged proTGF-β1, was concentrated to 1 ml and applied to
MonoQ and Superdex S200 columns connected in series and equilibrated with
TBS (pH 7.5). Purified proTGF-β1 was concentrated to about 15 mg ml⁻¹ in
10 mM Tris (pH 7.5), 75 mM NaCl for crystal screening in 96-well Greiner
microplates (100 nl hanging-drop vapour diffusion format) using a mosquito
crystallization robot (Molecular Dimensions) at 20 °C. Hits were optimized in 24-well
plates using hanging-drop vapour diffusion. However, better-diffracting crystals
could only be obtained from sitting drops containing equal volumes of 12-15 µl
protein and well solution under the optimized conditions of 6.5-7.5% PEG 3500,
17-18% isopropanol and 0.1 M sodium citrate (pH 5.6), with the addition of
4-5% glycerol to slow crystal growth and improve crystal size and shape.
Maximum single-crystal dimensions reached 450 µm × 150 µm × 40 µm.
Before cooling the crystals to 100 K in liquid nitrogen, three rounds of increases
in PEG 3350 concentration (12 h for each increase of 8% per cycle) were carried
out in the mother liquor. The final PEG 3350 concentration of about 31% was
sufficient for cryoprotection. Crystals are summarized in Supplementary Table 1.
There are two complexes per asymmetric unit, with a Matthews coefficient of
2.9 Å³ Da⁻¹, giving a solvent content of 57.8%.
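
The relation between the Matthews coefficient and the solvent content quoted above can be sketched as follows. This is a minimal illustration that assumes the commonly used approximation (solvent fraction = 1 − 1.23/V_M, with V_M in Å³ Da⁻¹); the function name and the constant 1.23 are assumptions of this sketch, not details taken from the Methods. With the rounded value V_M = 2.9 it gives a solvent content close to, but not identical with, the reported 57.8%, as expected if the reported figure was derived from an unrounded V_M or a slightly different protein density.

```python
def solvent_fraction(matthews_coefficient, protein_constant=1.23):
    """Estimate the crystal solvent fraction from the Matthews coefficient.

    matthews_coefficient: V_M in cubic angstroms per dalton.
    protein_constant: 1.23 is the usual approximation for a typical
    protein partial specific volume (an assumption of this sketch).
    """
    if matthews_coefficient <= 0:
        raise ValueError("Matthews coefficient must be positive")
    return 1.0 - protein_constant / matthews_coefficient


if __name__ == "__main__":
    vm = 2.9  # rounded value quoted in the text
    print(f"Estimated solvent content: {100 * solvent_fraction(vm):.1f}%")
```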

To prepare Se-Met proTGF-β1, cells were washed with PBS (pH 7.4),
supplemented with 1% FBS, then incubated for 8 h with methionine-free α-MEM (SAFC
Biosciences) supplemented with 50 mg l⁻¹ L-Se-Met (Sigma) and 10% dialyzed
FBS. After replacement with the same medium, cells were cultured for 4 d. Se-Met
proTGF-β1, at a yield of 1.5 mg l⁻¹, was purified and crystallized identically to
native proTGF-β1. Furthermore, a heavy-atom derivative was obtained by
soaking crystals in mother liquor containing 0.4 mM HgBr₂ for 4 h.
Structure determination and refinement. Native Se multiple-wavelength
anomalous dispersion (MAD) and single-wavelength anomalous dispersion (SAD) Hg
derivative data were collected at 100 K at beamline 23-ID, then processed using
HKL2000 (ref. 37) and XDS. Statistics are in Supplementary Table 1. Initial
experimental phases were determined independently using Se-MAD and
Hg-SAD, with 19 out of 24 Se sites and 14 Hg sites in the asymmetric unit located
using PHENIX. Electron density maps from Se-Met phasing, calculated after
fourfold non-crystallographic symmetry (NCS) averaging, clearly defined the
orientation of each monomer. The mature TGF-β1 homodimer was easily docked
into the map using model 1KLC with MOLREP in CCP4 (ref. 41). The
prodomain was built into the map manually. A crude model of proTGF-β1 was
obtained after rigid-body refinement by PHENIX, with both domains as one rigid
body. The same model was docked into Hg-SAD density for the two homodimers,
using MOLREP.

To improve the phases and extend them to higher resolution, multi-crystal
averaging (two crystals in total: Se-MAD and Hg derivative), multi-domain
averaging (with separate masks for each prodomain and TGF-β monomer) and
solvent flattening and histogram matching were performed using DMMULTI
from the CCP4 suite. The mask for each domain was calculated by NCSMASK in
CCP4, and NCS matrices for each domain between molecules and crystals were
computed by LSQKAB in CCP4. Rigid-body refinements were carried out by
PHENIX for each lattice, on the basis of the averaged maps. The new models for
each lattice were then used to calculate a set of new NCS matrices for the next
cycle of DMMULTI. These steps were cycled twice.

Model building in COOT was based on multi-crystal, multi-domain averaged
electron density maps and 2Fo − Fc maps. NCS restraints and
translation-libration-screw (TLS) groups were used in refinement with PHENIX. The
sequence-to-structure register was confirmed using Se anomalous maps. The
multi-crystal, multi-domain averaging was repeated using the refined structure
at an Rfree of about 33% and no major differences were found. Two residues from
the 3C protease site remain at the N terminus after cleavage. The structures include
residues 0-62, 70-208, 216-241, 250-361 and one N-acetylglucosamine (NAG)
residue (chain A); residues 1-62, 70-208, 216-242, 250-299, 310-361 and two
NAG residues (chain B); residues 1-62, 68-208, 216-241, 250-361 and three NAG
residues (chain C) and residues 0-62, 69-208, 216-242, 250-361 and two NAG
residues (chain D). Validation and Ramachandran statistics used
MOLPROBITY. All structure figures were generated using Pymol (DeLano
Scientific).

In the asymmetric unit of the crystal, a second pro-TGF dimer extends each
two-stranded β-ribbon to form a four-stranded, inter-dimer super-β-sheet in
which Leu 203 forms a hydrophobic lattice contact. In its absence, Leu 203 may
mediate hydrophobic interactions within the bowtie.

Mutagenesis. Wild-type human proTGF-β1 was inserted into the pEF1-puro
plasmid. Site-directed mutagenesis was performed using QuikChange
(Stratagene). All mutations were confirmed by DNA sequencing.

Free cysteine labelling and prodomain detection. HEK-293T cells were
transfected using Polyfect reagent (Qiagen) according to the manufacturer’s
instructions, using 2 µg of proTGF-β1 cDNA per 6-cm dish of cells at 70-80%
confluency. The cells were then cultured in FreeStyle serum-free medium
(Invitrogen) for 3 d. Supernatant was reacted with 450 µM biotin-BMCC
(1-biotinamido-4-(4’-(maleimidoethylcyclohexane)-carboxamido)butane) (Pierce)
for 60 min at 22 °C, followed by the addition of 40 mM N-ethylmaleimide.
ProTGF-β1 was immunoprecipitated with 1.5 µg anti-human-prodomain-1
(LAP-1) antibody (R&D Systems) and Protein A Sepharose beads (GE
Healthcare) at 4 °C for 2 h, then subjected to reducing SDS 10% PAGE. After
transfer to polyvinylidene difluoride membranes (Millipore), biotin was detected
using streptavidin-horseradish peroxidase with the ECL-plus western blotting kit
(GE Healthcare). Total proTGF-β1 was similarly detected on a separate blot using
biotinylated human prodomain-1 (LAP-1) antibody.

TGF-β1 activation assay. Transformed mink lung epithelial cells (TMLCs) stably
transfected with a luciferase construct under plasminogen activator inhibitor
promoter 1 (ref. 5) were provided by D. Rifkin (New York University).
HEK-293T cell transfectants stably expressing αv and β6 were selected with puromycin
and G418. Clones expressing high levels of integrin αvβ6 were selected by
immunofluorescent flow cytometry using an anti-β6 antibody. Cells stably transfected
with empty vector were used as a control. These cells were subsequently
transiently transfected with human wild-type or mutant proTGF-β1, using
lipofectamine with 0.4 µg plasmid DNA per well in a 48-well plate. After 16-24 h, each well
was used to seed 3 wells of a 96-well plate with about 15,000 cells, which were
co-cultured with 15,000 TMLCs in 100 µl DMEM with 0.1% BSA for 16-24 h.
TGF-β1-induced luciferase activity in cell lysates was measured using the luciferase
assay system (Promega). To assess heat-releasable TGF-β1, cells were transfected
as above except that polyfect was used in 6-well plates. After 2 d, cells were
collected and heated in 150 µl of DMEM with 0.1% BSA at 80 °C for 10 min.
TGF-β1 activity in 50 µl aliquots was measured using the luciferase assay with
TMLCs.

Negative-stain electron microscopy. A large latent-complex fragment was
isolated from supernatants of 293T cells transiently co-transfected with native
human proTGF-β and a human LTBP fragment containing the same
N-terminal tag as was used above on proTGF-β. The LTBP fragment contained
the TB3 and TB4 domains and two intervening EGF-like domains (residues
Thr 1333-Asn 1578, immature numbering). Multi-angle light scattering gave
an Mr of 119,400, compared to a calculated Mr of 120,400 for a 2:1
proTGF-β:LTBP fragment (2 × 46,400 for proTGF-β, including 27,500 for three
high-mannose N-linked sites, plus 27,600 for the LTBP fragment). The αvβ6
ectodomain was expressed using C-terminal α-helical coiled-coils and tags,
purified, then subjected to negative-stain electron microscopy as previously
described. Purified proTGF-β1 or proTGF-β1 in complex with an LTBP
fragment (20 µg), proTGF-β1 (30 µg) in molar excess over clasped αvβ6 (20 µg), or
clasped αvβ6 (60 µg) in excess over proTGF-β (10 µg) were subjected to Superdex
S200 chromatography in TBS (pH 7.5) with 1 mM Ca²⁺ and 1 mM Mg²⁺. Peak
fractions corresponding to the purified proteins or complexes were subjected to
X-ray structure of a bacterial
oligosaccharyltransferase


Christian Lizak, Sabina Gerber, Shin Numao, Markus Aebi & Kaspar P. Locher


Asparagine-linked glycosylation is a post-translational modification of proteins containing the conserved sequence
motif Asn-X-Ser/Thr. The attachment of oligosaccharides is implicated in diverse processes such as protein folding
and quality control, organism development or host-pathogen interactions. The reaction is catalysed by
oligosaccharyltransferase (OST), a membrane protein complex located in the endoplasmic reticulum. The central,
catalytic enzyme of OST is the STT3 subunit, which has homologues in bacteria and archaea. Here we report the
X-ray structure of a bacterial OST, the PglB protein of Campylobacter lari, in complex with an acceptor peptide. The
structure defines the fold of STT3 proteins and provides insight into glycosylation sequon recognition and amide nitrogen
activation, both of which are prerequisites for the formation of the N-glycosidic linkage. We also identified and validated
catalytically important, acidic amino acid residues. Our results provide the molecular basis for understanding the
mechanism of N-linked glycosylation.


It is estimated that more than half of all eukaryotic proteins are
glycoproteins; that is, specific amino acid side chains are chemically modified
with carbohydrates in a process termed glycosylation. The most
abundant of these modifications is asparagine-linked (N-linked)
glycosylation, which affects a multitude of cellular functions.
Asparagines are specifically glycosylated in the context of a consensus
sequon Asn-X-Ser/Thr when located in the endoplasmic reticulum
(ER). The reaction takes place on the luminal surface of the ER membrane
and is catalysed by OST, a hetero-oligomeric membrane protein
complex in most eukaryotes. A hallmark of N-linked glycosylation is
its broad specificity with respect to the polypeptide substrate, which is a
direct consequence of the short recognition sequon. This distinguishes
OST from O-glycosyltransferases that modify serine or threonine residues
and exhibit a higher specificity for their protein substrates.

The key step of OST-catalysed glycosylation is the formation of an
N-glycosidic linkage between the amide nitrogen of the acceptor
asparagine and the C1 carbon of the first saccharide moiety of a
lipid-linked oligosaccharide (LLO) donor (Fig. 1a). This results in
the en bloc transfer of the oligosaccharide onto the acceptor asparagine.
Details of the reaction mechanism are poorly understood owing
to the absence of structural insight into OST at high resolution, but
also to the complex chemical nature of LLO, its low abundance in
biological samples, and its insolubility in water. In contrast, crystal
structures of various soluble O-glycosyltransferases have been published
and their reaction mechanisms investigated in detail. For
OST, the currently accepted model suggests that glycosylation
sequons are recognized when in unfolded protein segments, either
during protein translocation into the ER or after translocation is
completed. The central, catalytically active component within
OST is the STT3 subunit. The other subunits are thought to assist
and refine the reaction.

N-linked glycosylation is not restricted to eukaryotes. Homologous
processes are found in archaea and in defined taxa of proteobacteria.
Eukaryotic and bacterial LLOs contain isoprenoid moieties
that anchor the oligosaccharides in the membrane, and pyrophosphates
as leaving groups of the substitution reactions. However, the
attached oligosaccharides are chemically distinct, and unlike their
eukaryotic counterparts, the OSTs of prokaryotes (and of eukaryotic
kinetoplastids) consist of a single subunit, which is homologous to the
STT3 subunits of eukaryotic OST complexes (Supplementary Fig. 1).
The best-studied bacterial OST, termed PglB (84 kDa), is
encoded in the protein glycosylation locus pgl of the Gram-negative
bacterium Campylobacter jejuni. This gene cluster is sufficient for
catalysing protein glycosylation when transferred into Escherichia coli
cells. The similarity in sequence and membrane topology indicates
that PglB and eukaryotic STT3s share a common reaction mechanism.
To understand the molecular basis of N-linked glycosylation,
we have determined the X-ray structure of PglB from C. lari,
which is 56% identical to that of C. jejuni. C. lari PglB is functional
when co-expressed with the C. jejuni pgl cluster in E. coli cells
(Fig. 1b). We co-crystallized C. lari PglB with the hexapeptide
DQNATF, an optimal acceptor sequence for C. jejuni PglB. X-ray
diffraction data were anisotropic and extended to 3.4 Å resolution. The
structure was refined to R/Rfree values of 23.8% and 27.1%,
respectively (Supplementary Table 1).


Structure of C. lari PglB
In agreement with earlier predictions (reviewed in ref. 6), the structure
of PglB revealed two domains: a transmembrane domain comprising
residues 1-432 and a periplasmic domain comprising residues 433-712
(Fig. 1c). In addition to the covalent linkage, the two domains have
extensive non-covalent interactions, provided mainly by the first
external loop (EL1) of the transmembrane domain that forms two
helices parallel to the membrane plane. The periplasmic domain features
a mixed α/β fold that was previously observed in the structures of
the homologous domains of C. jejuni PglB and of Pyrococcus furiosus
AglB. However, these isolated domains were catalytically inactive
and unable to bind acceptor peptide. Our structure of full-length PglB
provides a molecular explanation by revealing that the transmembrane
domain is indispensable both for peptide binding and catalysis.

In contrast to the periplasmic domain, the transmembrane domain 
features a novel fold (Fig. 2). Thirteen transmembrane segments are 
Figure 1 | Activity and structure of C. lari PglB. a, Reaction scheme of
N-linked glycosylation, yielding an N-glycosidic bond (red). In bacteria,
R1 = oligosaccharyl, R2 = NH-Ac, R3 = CH3. In eukaryotes, R1 = OH,
R2 = oligosaccharyl, R3 = CH2OH. b, In vivo glycosylation assay in E. coli.
Immunoblots detecting acceptor protein 3D5 (top), glycans (middle) or PglB
(bottom). PglB constructs indicated above the lanes include vector control (ev),
wild type (WT), or PglB mutants. Glycosylation yields a mobility shift from the
unmodified (g0) to the glycosylated form (g1). Functional PglB is partially auto-


connected by short cytoplasmic and external loops, with the exception
of the long external loops EL1 and EL5. Whereas EL1 is well ordered,
EL5 is only partially ordered, with 25 residues disordered in the
electron density map. Transmembrane segments TM1-4 and TM10-13
form the sequon-binding and catalytic sites and provide the bulk of
the interface with the periplasmic domain. In the peptide-bound state,
PglB forms two large cavities above the membrane surface, located at
opposite sides of the protein (Fig. 1d). The left-side cavity provides
access for acceptor proteins as suggested by the presence of bound
peptide in the structure, whereas the right-side cavity harbours the
catalytic residues and probably serves as the binding pocket for LLO.
The two cavities are connected where the side chain of the acceptor
asparagine reaches from the peptide-binding site into the catalytic
pocket.


Acceptor sequon binding and recognition 

Clear density for the bound hexapeptide DQNATF was observed in a
location that placed the acceptor asparagine some 15 Å above the
membrane surface (Fig. 1c). Almost 80% of the contact surface of
the peptide (calculated by AREAIMOL) is buried at the interface of
the transmembrane and periplasmic domains of PglB (Fig. 3a),
indicating tight binding and a firmly imposed conformation. The peptide
forms a loop that almost completes a 180° turn; accordingly, protein
substrates have to present their glycosylation sequons in sufficiently
large, flexible and surface-exposed loops, because the peptide-binding
cavity of PglB does not appear to fit fully folded protein
domains. The observed conformation of the peptide would be
incompatible with a proline residue at the +1 position, in agreement with
the observation that +1 prolines are not allowed in glycosylation
sequons.
glycosylated at N535 and N556, resulting in two additional bands (g1 and g2).
c, Ribbon diagrams of PglB structure, with transmembrane and periplasmic
domains in blue and green, respectively, and bound acceptor peptide in stick
representation and coloured yellow. The presumed position of the membrane is
indicated by a grey rectangle. d, Surface representation in semi-transparent
grey, with ribbons as in c. Two cavities at opposite sides of PglB are indicated by
dashed lines, providing access for substrates. The cavities are connected by a
tunnel that accommodates the acceptor asparagine.


A hallmark of N-linked glycosylation is the requirement of a serine
or threonine at the +2 position of the acceptor sequon. The PglB
structure provides a molecular explanation by revealing that the
β-hydroxyl group of the +2 Thr of bound hexapeptide forms three
hydrogen bonds, one with each of the side chains of the WWD motif,
which is strictly conserved in STT3 proteins (Fig. 3b and
Supplementary Fig. 3). The motif is located in the periplasmic domain, and the
interaction with the two tryptophan side chains and the aspartate side chain
saturates the hydrogen-bonding capacity of the β-hydroxyl group.
The arrangement physically separates the +2 Thr from the acceptor
asparagine, and we conclude that the WWD motif defines the
polypeptide substrate specificity, but is not directly involved in catalysis.
Notably, the structure can also explain preferences and deviations at
the +2 position of glycosylation sequons. The γ-methyl group of the
+2 Thr is in van der Waals contact with Ile 572 of PglB (3.6 Å distance
to the γ-methyl group of Ile 572; Fig. 3b and Supplementary Fig. 4).
This stabilizing interaction is absent if a serine is in the +2 position,
which may explain why acceptor sequons containing a +2 Thr are
glycosylated 40 times more efficiently than those containing a +2 Ser
(ref. 36). The structure further suggests that the non-natural,
S-configured threonine would cause a steric clash with Ile 572.
S-configured threonine is indeed not allowed at the +2 position, with
a 15,000-fold reduction in glycosylation efficiency compared to
R-configured threonine. Ile 572 is conserved in bacteria and has
been suggested to be part of a MXXI motif. However, the
corresponding residue in the archaeal AglB protein was found to be a
lysine, and sequence alignments with eukaryotic STT3 homologues
reveal no clear conservation of Ile 572, indicating that residues other
than isoleucine can provide contacts to the +2 Thr in homologous
proteins. The PglB structure can also explain allowed deviations from
Figure 2 | Topology of transmembrane domain. a, Ribbon diagram, with
helices numbered and coloured as in b. The periplasmic domain is shown as a 
green backbone trace. b, Topological schematic indicating helices and 
connecting loops. Dashed green lines indicate non-covalent contacts to the 
periplasmic domain. Conserved residues forming the active site of PglB are 
indicated by yellow stars and spheres and are labelled. R331 contributing to 
peptide recognition is indicated in red. 


the consensus sequon: the acceptor sequence N-X-C, present in
~2.2% of experimentally determined glycosylation sites of the mouse
glycoproteome, is probably allowed because the β-sulphydryl group
of cysteine can form hydrogen bonds similar to those of a β-hydroxyl group.
Glycines, alanines and valines have also been reported at the +2
position of glycosylated sequons, albeit only at low abundance.
These residues can, in principle, be accommodated in the binding
pocket of PglB because they are equal in size to or smaller than

Figure 3 | Sequon binding and recognition. a, Transmembrane (TM) and
periplasmic domains of PglB are in blue and green, respectively. Acceptor

threonine. However, glycosylation of sequons such as N-G-X, with
X being larger than threonine, or of T/S-X-N (reverse sequons) cannot
be explained by the PglB structure.
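
To give a rough sense of scale for the efficiency differences quoted above (40-fold for +2 Thr over +2 Ser, and 15,000-fold for R- over S-configured threonine), the following sketch converts a fold change into an apparent free-energy difference using ΔΔG = RT ln(fold). This is an illustration only: it assumes that the reported efficiency ratios behave like ratios of rate constants in the sense of transition-state theory and that the temperature is 298.15 K, neither of which is stated in the text. The function name is hypothetical.

```python
import math

# Gas constant in kcal mol^-1 K^-1.
R_KCAL_PER_MOL_K = 1.987204e-3


def fold_change_to_ddg(fold_change, temperature_k=298.15):
    """Convert an efficiency ratio into an apparent free-energy
    difference (kcal/mol), assuming ddG = R * T * ln(fold_change)."""
    if fold_change <= 0:
        raise ValueError("fold change must be positive")
    return R_KCAL_PER_MOL_K * temperature_k * math.log(fold_change)


if __name__ == "__main__":
    for label, fold in [("+2 Thr vs +2 Ser", 40.0),
                        ("R- vs S-configured Thr", 15000.0)]:
        print(f"{label}: about {fold_change_to_ddg(fold):.1f} kcal/mol")
```

Under these assumptions the 40-fold preference corresponds to roughly 2 kcal mol⁻¹, an order of magnitude that is plausible for a single van der Waals contact, whereas the 15,000-fold penalty corresponds to nearly 6 kcal mol⁻¹, more in keeping with a steric clash.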

Compared to the eukaryotic enzymes, bacterial OSTs analysed thus
far have an extended acceptor sequon: glycosylation is only efficient if
a negatively charged residue (Asp or Glu) is present at the −2 position,
resulting in a consensus sequon D/E-X-N-X-S/T. In PglB, R331
provides a salt bridge to the −2 Asp of bound peptide (Fig. 3a and
Supplementary Fig. 4), thereby strengthening the PglB-peptide interaction.
R331 is conserved in bacteria, but not in eukaryotes, where no
requirement for a negative charge at the −2 position is observed. The
extended sequon recognition may reflect the need for tighter peptide
binding in bacteria, where the local concentration of the acceptor
polypeptide is probably lower than in eukaryotes.
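
The two consensus sequons discussed above can be expressed as simple sequence patterns. The sketch below is an illustration, not a predictor of glycosylation: it encodes only the rules stated in the text (N-X-S/T for eukaryotes, D/E-X-N-X-S/T for bacteria, and no proline at the +1 position), leaves the −1 position unconstrained because the text places no restriction on it, and ignores the structural requirement that the sequon sit in a flexible, exposed loop. The names `SEQUONS` and `find_sequons` are hypothetical.

```python
import re

# Each entry: (pattern, offset of the acceptor Asn within the match).
# A lookahead is used so that overlapping sequons are all reported.
SEQUONS = {
    "eukaryotic": (re.compile(r"(?=(N[^P][ST]))"), 0),
    "bacterial": (re.compile(r"(?=([DE].N[^P][ST]))"), 2),
}


def find_sequons(sequence, kind):
    """Return (1-based Asn position, matched motif) for every sequon
    of the given kind in a one-letter amino acid sequence."""
    pattern, asn_offset = SEQUONS[kind]
    return [(m.start() + asn_offset + 1, m.group(1))
            for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    peptide = "DQNATF"  # acceptor hexapeptide used for co-crystallization
    print(find_sequons(peptide, "bacterial"))
    print(find_sequons(peptide, "eukaryotic"))
```

For the hexapeptide DQNATF, both patterns identify Asn 3 as the acceptor; replacing the −2 Asp with alanine removes the bacterial match but not the eukaryotic one, mirroring the extended bacterial requirement.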


Catalytic site 

The catalytic pocket is located in the right-side cavity of PglB (Fig. 1d)
and is marked by a bound cation, located ~8 Å above the membrane
boundary. Owing to the high concentration of magnesium salt in the
crystallization solution, it was modelled as Mg²⁺. PglB, like all OSTs,
is only functional with a bound divalent cation (M²⁺). The
physiological cation was suggested to be Mn²⁺, but PglB is also active in
Mg²⁺, a property that was previously observed for other
metal-dependent glycosyltransferases. The catalytic pocket of PglB features
three acidic side chains (D56, D154, E319) that are part of the
transmembrane domain and coordinate M²⁺ (Fig. 4 and Supplementary
Fig. 5). At the current resolution, water molecules that might be
additional ligands of M²⁺ cannot be modelled. The residues located
in the catalytic pocket are generally conserved in STT3 proteins
(Supplementary Fig. 3). The aspartates D154 and D156 belong to a
previously reported D-X-D motif, and mutation of either aspartate to
an alanine abolished the activity of the mannosyltransferase GPI-MT-1,
a member of the same glycosyltransferase superfamily as PglB (GT-C).
In contrast, D56 and E319 have not been previously identified
as catalytically relevant, but their carboxyl groups interact both with
M²⁺ and the amido group of the acceptor asparagine. Such interactions
would not be possible if a glutamine or an aspartate side chain
was present instead of the acceptor asparagine, which may explain
why glutamines and aspartates are not glycosylated by OST. To
confirm the catalytic involvement of the three M²⁺-binding residues, we
mutated them individually to alanines and tested the activity of the
resulting PglB mutants in a complementation assay (Fig. 1b). Even
though OST activity is not limiting in our assay, the mutation D154A
reduced the observed glycosylation yield by >50%, whereas D56A


peptide is in ball and stick representation; N and C denote amino and carboxy
termini. Peptide residues are labelled yellow, acceptor Asn is labelled red.
b, PglB residues interacting with the +2 Thr of bound peptide are shown in
green and labelled. Hydrogen bonds from the WWD motif to the β-hydroxyl
group are indicated by dashed lines.
Figure 4 | Catalytic site and amide nitrogen activation. a, Transmembrane
and periplasmic domains of PglB are coloured blue and green, respectively.
Selected side chains are in ball and stick representation, with carbon atoms
coloured cyan and green for transmembrane and periplasmic domain residues,
respectively, and yellow for acceptor peptide. Grey dashed lines indicate
hydrogen bonds or interactions with the divalent cation M²⁺. b, Chemical
structure of the catalytic site, indicating interactions as in a. Blue and yellow

and E319A reduced it by >90%. A PglB double mutant D56A/E319A 
was completely inactive. 

How the amido group of the acceptor asparagine might be activated for a
nucleophilic attack on the C1 carbon of the LLO substrate remains
controversial. Amides are poor nucleophiles because the free electron pair
of the nitrogen is conjugated to the carbonyl group. As a consequence, the
N-C bond has double-bond character, and the nucleophilicity of the
nitrogen is low. To explain the reactivity of the amido group, specific
conformations of the acceptor peptide, such as a 'β-turn' or an 'Asx-turn',
have been proposed, invoking direct involvement of the +2 Ser/Thr in
catalysis. Given the firm binding of the +2 Thr to the WWD motif in our
PglB structure, we can rule out such an involvement. Instead, the structure
of PglB presents a distinct possibility for explaining amide nitrogen
activation: the catalytically essential residues D56 and E319 are optimally
positioned to form hydrogen bonds with the two amide protons of the
acceptor asparagine. Forming such hydrogen bonds would require a
rotation of the N-C bond of the amido group, thereby abolishing the
conjugation of the nitrogen electrons with the carbonyl group
(Fig. 4c). Not only would this increase the electronegative nature of
the amide nitrogen (by polarizing the N-H bonds and increasing the
electron density on the nitrogen), but it would also generate an
sp³-hybridized nitrogen with a reactive lone pair optimally positioned for
a nucleophilic attack. The energy barrier for rotating the N-C bond in
most amides is estimated to be 16-20 kcal mol⁻¹, and the 270° amide
conformation shown in Fig. 4c has been calculated to have an energy
of ~18.6 kcal mol⁻¹ relative to the planar conformation. Hence, it
would take 1-2 low-barrier hydrogen bonds (each worth ~10 kcal
mol⁻¹) to provide sufficient energy to break permanently the
conjugation of the carboxamide group of the acceptor asparagine. The
carboxyl groups of D56 or E319 might provide such interactions in
the transition state of the reaction. However, a higher-resolution
structure will be required to measure the lengths of the hydrogen
bonds reliably. Mutating D56 to asparagine (D56N) has an even more
pronounced inhibitory effect than truncation to alanine, and the E319Q
mutant is completely inactive (Fig. 1b). This demonstrates that the
negative charges provided by the carboxyl groups of D56 and E319 are
essential for catalysis and that the acidic side chains cannot be
replaced by iso-electronic amides. Steric effects might explain the
increased inhibition of D56N relative to D56A and of E319Q relative
to E319A. 
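
The energy argument above is simple bookkeeping, and a minimal sketch may make it easier to follow. It uses only the figures quoted in the text (a 16-20 kcal/mol rotation barrier, ~18.6 kcal/mol for the twisted amide, ~10 kcal/mol per low-barrier hydrogen bond). The function names are illustrative and not part of the original work.

```python
# Figures quoted in the text, all in kcal/mol.
ROTATION_BARRIER_RANGE = (16.0, 20.0)  # estimated N-C rotation barrier in most amides
TWISTED_AMIDE_ENERGY = 18.6            # calculated energy of the twisted conformation
LBHB_ENERGY = 10.0                     # approximate worth of one low-barrier hydrogen bond


def bond_equivalents(cost, per_bond=LBHB_ENERGY):
    """Number of hydrogen-bond equivalents needed to pay an energy cost."""
    return cost / per_bond


def covers_cost(n_bonds, cost, per_bond=LBHB_ENERGY):
    """True if n_bonds hydrogen bonds together supply at least the given cost."""
    return n_bonds * per_bond >= cost


if __name__ == "__main__":
    low, high = (bond_equivalents(e) for e in ROTATION_BARRIER_RANGE)
    print(f"barrier range: {low:.2f} to {high:.2f} bond equivalents")
    print(f"twisted amide: {bond_equivalents(TWISTED_AMIDE_ENERGY):.2f} bond equivalents")
    print("one bond sufficient:", covers_cost(1, TWISTED_AMIDE_ENERGY))
    print("two bonds sufficient:", covers_cost(2, TWISTED_AMIDE_ENERGY))
```

Under these round numbers a single bond falls short of the twisted-amide energy while two bonds cover it, which is consistent with the suggestion that both D56 and E319 contribute.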
dashed lines indicate transmembrane domain and peptide backbones, 
respectively. c, Presumed mechanism of amide nitrogen activation. Yellow 
dashed lines indicate peptide backbone. The amido group of free acceptor Asn 
features electron delocalization, indicated by resonance. When bound to PglB, 
the amido group may form hydrogen bonds with the catalytically essential D56 
and E319, requiring rotation around the C-N bond (red arrow). 


Glycosylation mechanism 


Given that PglB is active even when solubilized in detergent (used for 
crystallization), our structure has probably captured a functionally 
competent state. Glycosylation occurs with inversion of configuration 
at the substituted C1 carbon. We modelled the LLO substrate into the 
PglB structure such that the di-N-acetyl-bacillosamine moiety is 
properly aligned for a nucleophilic attack by the activated amide 
nitrogen, while the leaving pyrophosphate group is in contact with 
M²⁺ and the conserved R375 (Fig. 5a and Supplementary Fig. 3). This 
arrangement places the additional saccharide moieties in the right-hand 
cavity of PglB and the C2 substituent of the first saccharide 
moiety, an N-acetyl group present in bacterial and eukaryotic LLO, 
in the vicinity of a conserved tyrosine residue (Y468, Supplementary 
Fig. 3), where density consistent with a bound water molecule 
is observed. When modelled as shown in Fig. 5a, the isoprenoid moieties 
of LLO are embedded in a hydrophobic groove on the PglB surface, 
pointing into the lipid bilayer (Supplementary Fig. 6). The function of 
M²⁺ in PglB appears to be twofold: on the one hand, it orients the acidic 
residues D56 and E319 that interact with the acceptor asparagine, and 
on the other, it stabilizes the leaving group (lipid-pyrophosphate) of 
the substitution. This dual function seems to be distinct from that in 
metal-dependent, configuration-inverting glycosyltransferases of the GT-A 
family, where M²⁺ primarily serves to stabilize the leaving 
group. 

With the acceptor peptide present in the structure and the LLO 
molecule tentatively modelled, we can propose a basic, three-state 
catalytic cycle for PglB-catalysed glycosylation (Fig. 5b). A critical 
element of the proposed mechanism is the engagement and dis- 
engagement of EL5, which we expect to be flexible and disordered 
in the absence of bound acceptor peptide (ground state). Upon 
sequon binding, the C-terminal half of EL5 is ordered and pins the 
bound acceptor peptide against the periplasmic domain of PglB, 
thereby restricting its motion. Because the essential E319 is part of 
EL5, this simultaneously results in the formation of the catalytic site, 
where the acceptor asparagine is oriented and the amide nitrogen 
activated. Binding of LLO will then allow a nucleophilic attack by 
the activated amide nitrogen, resulting in glycosylation. Once the 
glycosidic bond is formed, the newly attached saccharides are tightly 
pressed against PglB, causing steric tension that can be released by 
disengagement of EL5. This allows the glycopeptide to dissociate from 
the enzyme. Subsequent cleavage of the lipid-linked pyrophosphate 
Figure 5 | Proposed glycosylation mechanism of PglB. Transmembrane 
(TM) and periplasmic domain surfaces are shown in blue and green, 
respectively. Bound acceptor peptide is in ball and stick representation, and 
yellow lines indicate the N and C termini. a, The chemical structure of bacterial 
LLO is shown in white, highlighting presumed interactions with M²⁺ and R375 
of PglB, and a collinear arrangement of attacking and leaving groups of the 
substitution. A red arrow indicates the nucleophilic attack. A predominantly 
hydrophobic groove accommodating the isoprenoid moieties is indicated. b, The 
observed crystal structure reflects the top state, with acceptor peptide bound and 
the C-terminal half of EL5 ordered and engaged. The bottom-left state reflects 
the ground state, with no substrates bound and EL5 disordered, as indicated by a 
dashed line. In the bottom-right state, bound C. jejuni LLO is shown (red line for 
isoprenoid moieties, P for phosphate, ellipsoids for saccharide moieties). The 
molecular events indicated by arrows are: 1, sequon binding, EL5 engagement, 
acceptor asparagine activation; 2, LLO binding, glycosylation; 3, disengagement 
of EL5, release of glycosylated sequon and lipid-pyrophosphate (PP). 


anhydride and folding of the newly glycosylated protein probably 
provide the main contributions to the driving force of the in vivo 
reaction. We should point out that there is no experimental evidence 
indicating a strict sequence of events. Thus, the binding of LLO to 
PglB might precede that of acceptor peptide. 
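
As a reading aid, the three-state cycle of Fig. 5b can be written as a small transition table. This is only a sketch of the proposed ordering (sequon first, then LLO). The state and event names are hypothetical labels introduced here, and, as noted above, the actual order of substrate binding has not been established experimentally.

```python
from enum import Enum


class State(Enum):
    GROUND = "no substrate bound, EL5 disordered"
    PEPTIDE_BOUND = "sequon bound, EL5 engaged, acceptor Asn activated"
    TERNARY = "LLO bound, glycosyl transfer"


# (state, event) -> next state, following arrows 1-3 of Fig. 5b.
# Event names are illustrative labels, not terminology from the paper.
TRANSITIONS = {
    (State.GROUND, "bind_sequon"): State.PEPTIDE_BOUND,
    (State.PEPTIDE_BOUND, "bind_llo"): State.TERNARY,
    (State.TERNARY, "release_products"): State.GROUND,
}


def step(state, event):
    """Return the next state, or raise ValueError for a transition not in the model."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not allowed from {state.name}") from None


def run_cycle(start=State.GROUND):
    """Walk once around the proposed cycle and return the states visited."""
    state = start
    visited = [state]
    for event in ("bind_sequon", "bind_llo", "release_products"):
        state = step(state, event)
        visited.append(state)
    return visited


if __name__ == "__main__":
    for s in run_cycle():
        print(f"{s.name}: {s.value}")
```

Allowing LLO to bind first would only require adding entries to the table, which is one reason a table was chosen over hard-coded branching.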
Conclusions and outlook 


Our results provide the basis for understanding the mechanism of 
N-linked glycosylation at a molecular level. Future studies will be direc- 
ted at probing the outlined chemical concepts, which are at the core of 
both bacterial and eukaryotic OST. In addition, the interaction of the 
catalytic STT3 with the additional subunits in eukaryotic OST needs to 
be defined. At present, the highest resolution structure of a eukaryotic 
OST is a 12 Å electron microscopy map of the yeast enzyme. The PglB 
structure also provides opportunities for engineering the substrate spe- 
cificity both with respect to the acceptor sequon and LLO, which may 
open up new avenues for the production of glycoprotein and glycocon- 
jugate therapeutics. 


METHODS SUMMARY 


In vivo glycosylation assay. Escherichia coli SCM6 cells were transformed with 
three separate plasmids containing: (1) the C. jejuni pglBmut cluster (containing an 
inactivated pglB gene) to generate and flip LLO; (2) the glycosylation acceptor 
protein 3D5, a single-chain Fv fragment containing a DQNAT acceptor sequon; 
(3) C. lari PglB, wild type or mutants. Expression and glycosylation of 3D5 were 
monitored by SDS-PAGE of periplasmic cell extracts and visualized by mobility 
shift due to increased size in an immunoblot using anti-c-Myc antibody, or by the 
reactivity of the glycoprotein in an anti-glycan immunoblot using hR6 antiserum. 
Expression of C. lari PglB was monitored by immunoblots of whole-cell extracts 
using anti-haemagglutinin (HA) antiserum. 

Crystallization and structure determination. Campylobacter lari PglB containing 
a C-terminal His₁₀ tag was expressed in E. coli cells and purified in β-D-dodecyl 
maltoside. PglB was co-crystallized with an acceptor peptide, Ac-DQNATF{4-NO₂}-NH₂, 
containing an acetylated N terminus and an amidated C terminus. 
For experimental phasing, crystals were soaked in ethyl mercury phosphate (EMP) 
before data collection. Three distinct data sets of EMP derivatives were collected. 
The phase problem was solved by a combination of MIRAS and molecular 
replacement, using the periplasmic domain of C. jejuni PglB (Protein Data Bank code 
3AAG) as a search model. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS 


In vivo complementation. To analyse the activity of PglB from C. lari in vivo, the 
gene encoding PglB was amplified by polymerase chain reaction (PCR) from the 
pgl gene cluster in genomic DNA of a Campylobacter lari isolate (sample provided 
by H. Hachler) and was cloned into a pMLBAD plasmid with a C-terminal 
HA tag fused to PglB, resulting in the plasmid pMIK71 (ref. 28). For 
complementation studies, pMIK71 or pMLBAD empty vector was transformed into E. 
coli SCM6 cells carrying the plasmids pCL21 (ref. 47) and pACYCpglmut (ref. 25). 
pCL21 encodes the single-chain Fv fragment of 3D5 carrying 
a DQNAT glycosylation site in the linker region and a C-terminal Myc tag fused 
to 3D5. pACYCpglmut codes for the biosynthesis of the C. jejuni lipid-linked 
oligosaccharide (LLO) with an inactivated C. jejuni pglB gene (W458A and 
D459A). A 5 ml pre-culture was inoculated from a single clone and grown 
overnight at 37 °C in LB medium. The main culture was inoculated to an optical 
density (A₆₀₀) of 0.05 in 15 ml LB medium and grown at 37 °C to A₆₀₀ of 0.5. 
The culture was induced by addition of arabinose to 0.1% (w/v) and grown for 4 h 
at 24 °C. For extraction of periplasmic proteins, an equivalent of 1 ml culture 
volume with an A₆₀₀ of 3 was harvested by centrifugation, re-suspended in 150 µl 
extraction buffer, consisting of 30 mM Tris-HCl, pH 8.5; 20% (w/v) sucrose; 
1 mM EDTA and 1 mg ml⁻¹ lysozyme (Sigma) and incubated for 1 h at 4 °C. A 
final centrifugation step yielded periplasmic proteins in the supernatant. 
Glycosylation of 3D5 and expression of PglB were analysed by immunoblot 
following SDS-PAGE. Immunodetection was performed with anti-c-Myc mono- 
clonal antibody (Calbiochem) and anti-glycan serum hR6 (S. Amber and M. Aebi, 
personal communication) to observe glycosylated 3D5. Immunodetection of C. 
lari PglB was performed with anti-HA antiserum (Santa Cruz). All experiments 
were performed at least in triplicate, and representative samples are shown. 
Mutagenesis study. Mutant PglB was generated by the QuickChange method, 
and the resulting plasmids of all constructs were validated by DNA sequencing. 
The mutant PglB variants were cloned into pMLBAD as above and used in 
complementation assays. 

PglB purification. The gene encoding PglB was cloned into a modified pBAD 
(Invitrogen) expression plasmid with a C-terminal decahistidine affinity tag fused 
to PglB, resulting in the plasmid pSF2. Owing to the applied cloning strategy, PglB 
carried the mutation K2E and the plasmid was confirmed by DNA sequencing 
(Microsynth). 

PglB from C. lari was overexpressed from pSF2 in E. coli BL21-Gold (DE3) 
(Stratagene) cells in a 30 l fermentor (Infors). Cells were grown at 37 °C in Terrific 
Broth medium supplemented with 1% glycerol (w/v) to A₆₀₀ of 10.0 before the 
culture was induced by the addition of 0.1% arabinose (w/v) for 2 h. All following 
steps were performed at 4 °C unless specified otherwise. Cells were harvested by 
centrifugation, re-suspended in 25 mM Tris-HCl, pH 8.0; 250 mM NaCl and 
disrupted in a M-110L microfluidizer (Microfluidics) at 15,000 p.s.i. chamber 
pressure. Membranes were pelleted by ultracentrifugation at 100,000g for 0.5 h. 
PglB was solubilized in 25 mM Tris-HCl, pH 8.0; 250 mM NaCl; 10% glycerol 
(v/v) and 1% N-dodecyl-β-D-maltopyranoside (w/v) (DDM, Anatrace) for 1 h. 
All subsequent buffers contained DDM as detergent. 

The supernatant was supplemented with 25 mM imidazole and loaded onto a 
NiNTA superflow affinity column (Qiagen), washed with 60 mM imidazole 
before PglB was eluted with 200 mM imidazole. The protein was desalted into 
10 mM MES-NaOH, pH 6.5; 100 mM NaCl; 0.5 mM EDTA; 3% glycerol (v/v); 3% 
polyethylene glycol 400 (v/v) and concentrated to 7-10 mg ml⁻¹ in an Amicon 
Ultra-15 concentrator (Millipore) with a molecular mass cutoff of 100 kDa. 
Native crystals. The peptide Ac-DQNATF{4-NO₂}-NH₂ was added to concentrated 
PglB to a final concentration of 0.75 mM, incubated for 0.5 h, and crystallized 
by vapour diffusion in sitting drops at 20 °C against a reservoir of 100 mM 
glycine, pH 9.4; 50 mM magnesium acetate; 6% dimethyl sulphoxide (DMSO) (v/v) 
and 23-34% (v/v) polyethylene glycol 400. The protein to reservoir volume 
ratio in the sitting drop was 2:1. Crystals typically appeared after 3-4 weeks and 
matured to full size within 6 weeks. Crystals were directly flash frozen by 
immersion in liquid nitrogen before data collection. 

Heavy-metal derivatives. Native crystals were soaked for 30-60 min in 1mM 
ethyl mercury phosphate (EMP) before back-soaking and flash-freezing by 
immersion in liquid nitrogen. 

Data collection. Crystals belonged to the space group P2₁2₁2₁, with one 
PglB-peptide complex in the asymmetric unit. Native data were collected at the 
microdiffractometer beamline X06SA at the Swiss Light Source (SLS, Villigen) because 
not all sections of the crystals diffracted equally well. EMP2 and EMP3 derivative 
data sets (Supplementary Table 1) were collected at the same station, whereas 
EMP1 was collected at the high-resolution station of the same beam line. Data 
were processed and merged with XDS or HKL2000 (HKL Research, Inc.) and 
anisotropic scaling/ellipsoid truncation was performed. 

Structure determination. The structure was determined using a combination of 
molecular replacement using the periplasmic domain of C. jejuni PglB (Protein Data 
Bank code 3AAG) as a search model and Phaser on the one hand, and multiple 
isomorphous replacement with anomalous scattering (MIRAS) using SHARP (Global 
Phasing Limited) on the other. The process of phase calculation and model building 
(using O) and refinement (using Phenix) was iterated, starting with the periplasmic 
domain and extending into the best-ordered regions of the transmembrane domain 
(TM1-4 and TM10-13), followed by TM5-9. The locations of three cysteines in the 
transmembrane domain (indicated by Hg anomalous peaks) served as starting points 
for tracing, until very good density allowed placement of bulky residues, confirming 
the sequence register. The final structure excludes two disordered loops of PglB (resi- 
dues 283-306 and residues 605-607) as well as the C-terminal polyhistidine tag. Data 
collection and refinement statistics are given in Supplementary Table 1. 
Black hole growth in the early Universe is 
self-regulated and largely hidden from view 


Ezequiel Treister, Kevin Schawinski, Marta Volonteri, Priyamvada Natarajan 


The formation of the first massive objects in the infant Universe 
remains impossible to observe directly and yet it sets the stage for 
the subsequent evolution of galaxies. Although some black holes 
with masses more than 10⁹ times that of the Sun have been detected 
in luminous quasars less than one billion years after the Big 
Bang, these individual extreme objects have limited utility in 
constraining the channels of formation of the earliest black holes; 
this is because the initial conditions of black hole seed properties 
are quickly erased during the growth process. Here we report a 
measurement of the amount of black hole growth in galaxies at 
redshift z = 6-8 (0.95-0.7 billion years after the Big Bang), based 
on optimally stacked, archival X-ray observations. Our results 
imply that black holes grow in tandem with their host galaxies 
throughout cosmic history, starting from the earliest times. We 
find that most copiously accreting black holes at these epochs are 
buried in significant amounts of gas and dust that absorb most 
radiation except for the highest-energy X-rays. This suggests that 
black holes grew significantly more during these early bursts than 
was previously thought, but because of the obscuration of their 
ultraviolet emission they did not contribute to the re-ionization 
of the Universe. 
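
The correspondence between redshift and cosmic age quoted above can be checked with the closed-form age of a flat universe containing matter and a cosmological constant. The sketch below is illustrative only: the Hubble constant (70 km/s/Mpc) and matter density (0.3) are assumed round values, not parameters stated in this text, so the results agree with the quoted 0.95-0.7 billion years only approximately.

```python
import math

KM_PER_MPC = 3.0856775814913673e19
SECONDS_PER_GYR = 3.15576e16


def age_at_redshift_gyr(z, h0_kms_mpc=70.0, omega_m=0.3):
    """Age of a flat matter + Lambda universe at redshift z, in Gyr.

    h0_kms_mpc and omega_m are assumed illustrative values.
    """
    omega_l = 1.0 - omega_m                      # flatness assumption
    hubble_time_gyr = KM_PER_MPC / h0_kms_mpc / SECONDS_PER_GYR
    x = math.sqrt(omega_l / omega_m) * (1.0 + z) ** -1.5
    return 2.0 / (3.0 * math.sqrt(omega_l)) * hubble_time_gyr * math.asinh(x)


if __name__ == "__main__":
    for z in (6.0, 7.0, 8.0):
        print(f"z = {z:.0f}: {age_at_redshift_gyr(z):.2f} Gyr after the Big Bang")
```

Changing the assumed parameters within their commonly used ranges shifts these ages by a few per cent.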

The Chandra X-ray observatory is sensitive to photons in the energy 
range 0.5-8 keV, which in deep extragalactic observations probes 
predominantly accretion onto supermassive black holes. Rapidly growing 
black holes are known to be surrounded by an obscuring medium, 
which can block most of the optical, ultraviolet and even soft X-ray 
photons. With increasing redshift, at the earliest epochs, the photons 
observed by Chandra are emitted at intrinsically higher energies, and 
are therefore less affected by such absorption. Current X-ray observations 
have not been able to individually detect most of the first black 
hole growth events at z > 6 (the first 950 million years after the Big 
Bang), except for the most luminous quasars with X-ray 
luminosity L_X > 3 × 10⁴⁴ erg s⁻¹. Whereas deep X-ray surveys do 
not cover enough volume at high redshift, current wide-area studies 
are simply not deep enough. Hence, the only way to obtain a detectable 
signal from more typical growing black holes is by adding the X-ray 
emission from a large number of sources at these redshifts; we pursue 
this strategy here. 

We start by studying the collective X-ray emission from the most 
distant galaxies known, at z ~ 6 (ref. 10), z ~ 7 (ref. 11) and z ~ 8 (ref. 
12), detected by the Wide Field Camera aboard the Hubble Space 
Telescope. These galaxies are as massive as today's galaxies (stellar 
mass 10⁹-10¹¹ M⊙, where M⊙ is the solar mass), and they are thus 
likely to harbour substantial central black holes. None of the z > 6 
galaxies studied in this work are individually detected in the 
Chandra X-ray observations. However, we detect significant signals 
from a stack of 197 galaxies at z ~ 6 in both the soft (0.5-2.0 keV; 
corresponding to 3.5-14 keV in the rest frame) and hard (2-8 keV; 
rest-frame 14-56 keV) X-ray bands independently. The detection in 
the soft band is significant at the 5σ level, and implies an average 
observed-frame luminosity of 9.2 × 10⁴¹ erg s⁻¹, while in the hard 
band the stacked 6.8σ signal corresponds to an average luminosity 
of 8.4 × 10⁴² erg s⁻¹. For the sample of galaxies at z ~ 7, we obtain 
3σ upper limits for the average luminosity in the observed-frame soft 
and hard X-ray bands of 4 × 10“ erg s⁻¹ and 2.9 × 10% erg s⁻¹, 
respectively. Combining the z ~ 7 and z ~ 8 samples, the corresponding 
3σ upper limits are 3.1 × 10” erg s⁻¹ and 2.2 × 10“ erg s⁻¹ in the 
observed-frame soft and hard X-ray bands, respectively. 
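
The rest-frame energies quoted for the stacked bands follow from multiplying the observed photon energy by (1 + z). A minimal sketch of that conversion is given below. The band edges and z = 6 are taken from the text; the function and dictionary names are illustrative.

```python
# Observed-frame Chandra bands used in the stacking analysis, in keV.
OBSERVED_BANDS_KEV = {
    "soft": (0.5, 2.0),
    "hard": (2.0, 8.0),
}


def rest_frame_kev(observed_kev, z):
    """Photon energy at emission for a source at redshift z."""
    return observed_kev * (1.0 + z)


def rest_frame_band(band, z):
    """Rest-frame (low, high) edges of a named observed band."""
    low, high = OBSERVED_BANDS_KEV[band]
    return rest_frame_kev(low, z), rest_frame_kev(high, z)


if __name__ == "__main__":
    for name in OBSERVED_BANDS_KEV:
        low, high = rest_frame_band(name, z=6.0)
        print(f"{name}: {low:g}-{high:g} keV in the rest frame at z = 6")
```

This is why the observed hard band is far less sensitive to absorption at these redshifts: it samples photons emitted above 14 keV.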

A large difference, of a factor of ~9, is found between the stacked 
fluxes in the soft and hard X-ray bands at z ~ 6. This requires large 
amounts of obscuring material with high neutral hydrogen column 
densities (N_H > 1.6 × 10²⁴ cm⁻²) to be present in a very high fraction 
of the accreting black holes in these galaxies, in order to explain the 
large deficit of soft X-ray photons. As this signal derives from the entire 
population, these results require almost all sources to be significantly 
obscured. This in turn implies that these growing black holes are 
obscured along most lines of sight, as is also observed in a small subset 
of nearby objects. Such a high fraction of obscured sources at low 
luminosities is also observed at low redshifts. This large amount of 
obscuration along all directions absorbs virtually all ultraviolet 
photons from growing black holes. Thus, regardless of the amount 
of accretion in these sources, these active galaxies cannot have 
contributed to the early re-ionization of the Universe. Conversely, 
rapid and efficient supermassive black hole growth in the high-z 
Universe cannot be dismissed as implausible on the basis of 
re-ionization constraints. If most of the high-redshift black hole growth 
is indeed obscured, as suggested by our work, several current 
constraints on the lifetime and duty cycle of high-z accreting black holes 
need to be revisited and revised. 

Assuming that the X-ray emission is due to accretion onto the central 
black hole, the space density of mass accreted by black holes (in terms of 
solar masses per Mpc³) can be directly derived from the observed X-ray 
luminosity, as described in the Supplementary Information. Extrapolations 
of active galactic nuclei (AGN) luminosity functions measured at 
significantly lower redshifts, z < 3, are consistent with the observed 
accreted black hole mass density at z > 6, as can be seen in Fig. 1. 
This directly leads to two further conclusions. First, the space density of 
low-luminosity (L_X < 10⁴⁴ erg s⁻¹) sources does not evolve significantly 
from z ~ 1 to z ~ 6-8, that is, over more than 5 billion years. 
Second, at higher luminosities, the extrapolation of lower-redshift AGN 
luminosity functions leads to an overestimate of the observed source 
density in optical surveys. This discrepancy can be resolved if the 
shape of the AGN luminosity function evolves strongly, in the sense 
that there are relatively fewer high-luminosity AGN at z > 6 in 
comparison to the z < 3 population. Another possibility is that the number 
of obscured sources, relative to unobscured quasars, increases with 
redshift, such that most of the highly obscured systems are systematically 
missed in these optical studies. This is strongly supported by 
Figure 1 | Accreted black hole mass density as a function of redshift. Grey 
rectangle (top left) shows the range of values allowed by observations of z ~ 0 
galaxies. The data points at z ~ 2 correspond to the values obtained from 
Chandra observations of X-ray detected AGN (dashed error bars) and luminous 
infrared galaxies (solid error bars), while the measurement at z ~ 6-7 and the 
upper limits (downward arrows) at z = 7-9 show the results described in this 
work (red and black data points from the observed-frame soft and hard X-ray 
band observations, respectively). Vertical error bars, 1 s.d.; horizontal error bars, 
bin size. Black solid line, evolution of the accreted black hole mass density 
inferred from the extrapolation of AGN luminosity functions measured at lower 
redshifts. We overplot the predictions of black hole and galaxy evolution 
models for non-regulated growth of population-III star remnants (cyan line) 
and direct-collapse seeds (green line). The red and blue lines show the predicted 
black hole mass density if self-regulation is incorporated. 


observations of quasars at lower redshifts, z <3 (ref. 19). We cannot 
rule out either of these scenarios at present owing to the relatively 
small cosmological volume studied, in which the extremely rare 
high-luminosity AGN are absent. 

Our measurements and upper limits for the accreted black hole mass 
density up to z ~ 8.5 (~600 million years after the Big Bang) constrain 
the nature of black hole growth in the early Universe. Two critical 
issues for AGN and the supermassive black holes powering them are 
how the first black holes formed, and how they subsequently grew by 
accreting mass while shining as AGN. The strong local correlation 
between black hole mass and galaxy bulge mass observed at z ~ 0 (refs 
20, 21) is interpreted as evidence for self-regulated black hole growth 
and galaxy-black hole co-evolution. This is currently the default 
assumption for most galaxy formation and evolution models. 

The origin of the initial ‘seed’ black holes remains an unsolved 
problem at present. Two channels to form these seeds have been 
proposed: compact remnants of the first stars, the so-called population 
III stars, which generate seeds with masses ~10-10° M⊙; and the 
direct gravitational collapse of gas-rich pre-galactic disks, which leads to 
significantly more massive seeds with masses in the range 10°-10° M⊙ 
(refs 26, 27). By modelling, we find that the masses of seeds that form 
from direct collapse are correlated to properties of the dark matter halo 
and hence properties of the galaxy that will assemble subsequently. 

To interpret our finding, we explore a theoretical framework for the 
cosmic evolution of supermassive black holes. We follow the forma- 
tion and evolution of black holes through dedicated Monte Carlo 
merger tree simulations. Each model is constructed by tracing the 
merger hierarchy of dark matter haloes in the mass range 10¹¹-10¹⁵ 
M⊙ backwards to z = 20, using an extended Press and Schechter 
algorithm. The haloes are then seeded with black holes and their 
evolution is tracked forward to the present time. Following a major 
merger (defined as a merger between two haloes with mass ratio >0.1), 
supermassive black holes accrete efficiently an amount of mass that is 
set by a ‘self-regulated’ model (where the accreted mass scales with the 
fourth power of the host halo circular velocity and is normalized to 
reproduce the observed local correlation between supermassive black 
hole mass and velocity dispersion) or by an ‘unregulated’ model, where 
the supermassive black hole simply doubles in mass at each accretion 
episode. See Supplementary Information for additional details. 
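
The two accretion prescriptions described above differ only in how much mass a black hole gains after a major merger. The sketch below illustrates the rules as stated in the text (mass ratio above 0.1 for a major merger, accreted mass scaling as the fourth power of circular velocity, or simple doubling). The normalization constants `norm_msun` and `v_ref_kms` are hypothetical placeholders: the text says only that the scaling is normalized to the local relation, without giving values here.

```python
def is_major_merger(m_halo_a, m_halo_b, threshold=0.1):
    """Major merger as defined in the text: halo mass ratio above 0.1."""
    small, large = sorted((m_halo_a, m_halo_b))
    return small / large > threshold


def unregulated_accretion(m_bh):
    """Unregulated model: the black hole doubles, so it accretes its own mass."""
    return m_bh


def self_regulated_accretion(v_circ_kms, norm_msun=1.0e8, v_ref_kms=200.0):
    """Self-regulated model: accreted mass scales as circular velocity to the 4th power.

    norm_msun and v_ref_kms are hypothetical normalization values chosen
    for illustration; they are not taken from the paper.
    """
    return norm_msun * (v_circ_kms / v_ref_kms) ** 4


def grow(m_bh, m_halo_a, m_halo_b, v_circ_kms, regulated=True):
    """Black hole mass after a merger; minor mergers trigger no accretion."""
    if not is_major_merger(m_halo_a, m_halo_b):
        return m_bh
    if regulated:
        return m_bh + self_regulated_accretion(v_circ_kms)
    return m_bh + unregulated_accretion(m_bh)


if __name__ == "__main__":
    seed = 1.0e5  # illustrative seed mass in solar masses
    print(grow(seed, 1e12, 3e11, v_circ_kms=150.0, regulated=True))
    print(grow(seed, 1e12, 3e11, v_circ_kms=150.0, regulated=False))
```

In the regulated rule the mass gained is independent of the current black hole mass, which is what ties the black hole to its host rather than to its own growth history.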

Our observational results provide strong support for the existence of 
a correlation between supermassive black holes and their hosts out to 
the highest redshifts. In Fig. 1, we compare both unregulated and self- 
regulated black hole growth models with our observations, and find 
that physically motivated self-regulated growth models are highly 
favoured at all redshifts, even in the very early Universe. Unregulated 
models (for instance, those in which black holes just double in mass at 
each major merger) are strongly disfavoured by the data. This indicates 
that even in the first episodes of black hole growth, there is a fun- 
damental link between galaxy and black hole mass assembly. 

As shown in Fig. 1, once a standard prescription for self-regulation 
(as described above) is incorporated, both seed models are consistent 
with our current high-z observations. Detection of an unbiased popu- 
lation of sources at these early epochs is the one metric that we have in 
the foreseeable future to distinguish between these two scenarios for 
the origin of supermassive black holes in the Universe. In Fig. 2, we 
present the predicted cumulative source counts at z > 6 for the models 
studied here. On the basis of these models, ultra-deep X-ray and 
near-infrared surveys covering at least ~1 degree² are required to constrain 
Figure 2 | Cumulative number of sources as a function of redshift for 
individual X-ray detections. This calculation assumes the X-ray flux limit of 
the 4 Ms CDF-S Chandra observations. The horizontal dotted line shows the 
number density required to individually detect one source in the area 
considered in this work at z > 7. Models are described in the Supplementary 
Information and labelled in the figure. Red, population III, Eddington fraction 
(f_Edd) = 1; blue, direct collapse, f_Edd = 0.3; black, direct collapse, f_Edd = 0.3, ×2. 
We note that model population III, f_Edd = 1, ×2 has no detectable source. To 
distinguish between these models for early black hole formation will require a 
deep multiwavelength survey covering at least ~1 degree². 
the formation of the first black hole seeds. This will probably require the 
use of the next generation of space-based observatories, such as the 
James Webb Space Telescope and the International X-ray Observatory. 
Zero outward flow velocity for plasma in a 
heliosheath transition layer 


Stamatios M. Krimigis, Edmond C. Roelof, Robert B. Decker & Matthew E. Hill 


Voyager 1 has been in the reservoir of energetic ions and electrons 
that constitutes the heliosheath since it crossed the solar wind 
termination shock on 16 December 2004 at a distance from 
the Sun of 94 astronomical units (1 AU = 1.5 × 10⁸ km). It is now 
~22 AU past the termination shock crossing. The bulk velocity of 
the plasma in the radial-transverse plane has been determined 
using measurements of the anisotropy of the convected energetic 
ion distribution. Here we report that the radial component of the 
velocity has been decreasing almost linearly over the past three 
years, from ~70 km s⁻¹ to ~0 km s⁻¹, where it has remained for 
the past eight months. It now seems that Voyager 1 has entered a 
finite transition layer of zero-radial-velocity plasma flow, 
indicating that the spacecraft may be close to the heliopause, the border 
between the heliosheath and the interstellar plasma. The existence 
of a flow transition layer in the heliosheath contradicts current 
predictions (generally assumed by conceptual models) of a sharp 
discontinuity at the heliopause. 

Figure 1a shows that there has been little change in the intensity of 
lower-energy ions at the position of Voyager 1 since April 2007. 
Indeed, a flat intensity profile and near-constant power-law energy 
spectrum (j ∝ E^−γ) have been persistently observed at all ion energies 
from ~40 keV to 2 MeV, as is indicated by the constancy of the 
exponent γ (Fig. 1a, diamonds), which has been computed at 53-85 keV but 
is typical of the entire energy range. We note for later reference the 
small but steep intensity decrease during the last three months of 2010 
observed for all low-energy ions (because of the nearly constant 
spectral shape). 

The heliosheath plasma flow velocity as estimated from ion intensity 
anisotropies measured using LECP is shown in Fig. 1b-d. In Fig. 1b, 
the radial component, V_R, undergoes a long, steady, nearly linear 
decline beginning at ~70 km s⁻¹ and reaching ~0 km s⁻¹ in April 
2010, with an abrupt change in slope to nearly zero. This remarkable 
near-zero radial velocity continued until at least February 2011 (the 
spacecraft velocity of 17 km s⁻¹ has been subtracted). We regard the 
abrupt change in ∂V_R/∂t in April 2010 (from negative to zero) as an 
indication that Voyager 1 had entered a finite 'transition layer', 
terminology suggested in ref. 9. The more conventional conception of the 
heliopause has been that of a surface (separatrix) between the heated 
solar wind plasma of the heliosheath and the cold interstellar plasma of 
the very local near-interstellar medium, such that the theoretically 


Figure 1 | Directional velocity measurements from Voyager 1 at the edge of
the heliosphere. The Low Energy Charged Particle (LECP) instrument on
Voyager 1 provides angular information via a mechanically stepped platform in
eight 45° sectors. a-c, The velocity components of plasma flow (b, c) are
calculated, using a method published elsewhere, from the directional
measurements of the 53-85-keV ion intensities, whose scan-average intensity
and power-law spectral index (γ) are shown in a. Briefly, spacecraft-frame
intensities, j(φ), are represented by a second-order Fourier series in the scan
angle, φ (0 ≤ φ ≤ 2π), where j(φ) = A_0 + A_1 sin(φ − φ_1) + A_2 sin 2(φ − φ_2). The
parameters A_0, (A_1, φ_1) and (A_2, φ_2), which generate the harmonic anisotropy
amplitudes, ξ_1 and ξ_2, are determined by a least-squares fit to intensities in
sectors 1-7. For ξ_2 ≪ ξ_1, ξ_1 ≈ 2(γ + 1)V/v, implying that V = vξ_1/[2(γ + 1)],
where v is the velocity of the energetic particles. The velocity is resolved into
components in the Voyager 1 R-T′ instrument scan plane, which is rotated 20°
anticlockwise about the radial (+R) direction from the R-T plane in the
conventional R-T-N heliographic polar coordinates in which the transverse
direction (+T) is that of planetary motion around the Sun. The error bars (in
a-c; smaller than the point sizes in a) are computed from the Poisson standard
deviations in the directional rates. Plots of the adjacent channels (40-53 keV
and 80-139 keV) show nearly identical results. Such agreement independent of
particle energy is an assurance that the energetic ion distribution is indeed being
advected with the plasma bulk velocity. A least-squares fit to the last ten points
of V_R (b, heavy dashed line) gives an average velocity of ⟨V_R⟩ = 1.0 ± 2.4 km s⁻¹
with an average slope of ⟨∂V_R/∂t⟩ = −6.1 ± 12.1 km s⁻¹ yr⁻¹, such that both
mean and slope are statistically consistent with zero. In c, a least-squares fit to
the transverse component, V_T′, is shown by the dashed line. d, The plasma flow
velocity in the R-T′ plane represented as vectors, the head giving the velocity
(V_RT′) and the tail being located at the observation time along the time axis.
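The velocity estimate described in this caption can be sketched numerically. The following is a minimal illustration, not the authors' pipeline: it assumes the ions are protons, that the sector centres sit at 22.5° + 45°k (a hypothetical layout), that the first-order anisotropy is ξ_1 = A_1/A_0, and that the spectral index is 1.5. All function names are invented for the example.

```python
import numpy as np

PROTON_REST_ENERGY_KEV = 938_272.0  # proton rest energy in keV
C_KM_S = 299_792.458                # speed of light in km/s


def proton_speed_km_s(kinetic_energy_kev):
    """Non-relativistic speed of a proton (assumed ion species)."""
    return C_KM_S * np.sqrt(2.0 * kinetic_energy_kev / PROTON_REST_ENERGY_KEV)


def fit_second_order_fourier(phi, intensity):
    """Least-squares fit of j = A0 + A1 sin(phi - phi1) + A2 sin 2(phi - phi2).

    The model is linear in the coefficients of 1, sin, cos, sin 2phi, cos 2phi,
    so an ordinary linear least-squares solve is enough.
    """
    design = np.column_stack([
        np.ones_like(phi),
        np.sin(phi), np.cos(phi),
        np.sin(2 * phi), np.cos(2 * phi),
    ])
    a0, s1, c1, s2, c2 = np.linalg.lstsq(design, intensity, rcond=None)[0]
    # A1 sin(phi - phi1) = (A1 cos phi1) sin phi - (A1 sin phi1) cos phi
    a1, phi1 = np.hypot(s1, c1), np.arctan2(-c1, s1)
    a2, phi2 = np.hypot(s2, c2), 0.5 * np.arctan2(-c2, s2)
    return a0, a1, phi1, a2, phi2


def flow_speed_km_s(a0, a1, spectral_index, particle_speed):
    """V = v * xi1 / (2 (gamma + 1)), with xi1 = A1 / A0 (assumed definition)."""
    xi1 = a1 / a0
    return particle_speed * xi1 / (2.0 * (spectral_index + 1.0))


# Synthetic check: seven usable sectors, noise-free first-order anisotropy.
sector_centres = np.deg2rad(22.5 + 45.0 * np.arange(7))  # hypothetical layout
gamma = 1.5                                              # assumed index
v = proton_speed_km_s(67.0)          # mid-channel energy of 53-85 keV
true_flow = 40.0                     # km/s, chosen for the test
xi1_true = 2.0 * (gamma + 1.0) * true_flow / v
synthetic = 100.0 * (1.0 + xi1_true * np.sin(sector_centres - 0.3))

a0, a1, phi1, a2, phi2 = fit_second_order_fourier(sector_centres, synthetic)
recovered_flow = flow_speed_km_s(a0, a1, gamma, v)
```

With real data the fit would be weighted by the Poisson uncertainties of the sector rates; the unweighted solve is used here only to keep the sketch short.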
Figure 2 | The heliosphere and its boundaries in the general direction of
Voyager 1. a, Large-scale illustration depicting the solar wind radial flow inside
the termination shock and the expected deflection of the heliosheath flow
between the termination shock and the heliopause. The heliopause marks the
(theoretical) boundary between the heliosheath plasma flow and the much
denser, colder and slower flow of interstellar plasma being deflected around the
heliosheath. V1, Voyager 1; V2, Voyager 2. b, Scale drawing of the unexpected
transition layer that Voyager 1 has encountered within the heliosheath. For
illustrative purposes, we have assigned a value (V_N ≈ 40 km s⁻¹) to the
(unmeasured) meridional velocity component throughout the layer. Voyager 1
measurements have also revealed the thickness of the transition layer, a possible
location for the heliopause and the consistency of the range of locations of the


expected gradual reduction of V_R (due to the rotation of the flow so as
to parallel the heliopause) would end with V_R asymptotically (not
abruptly) going to zero, and then only at the heliopause itself.

Figure 1c shows that the transverse velocity, V_T′, fluctuated around a
mean value of ⟨V_T′⟩ ≈ −40 km s⁻¹ for ~5 yr until near the end of
2010, when a trend towards zero began (Fig. 1c, last four points).
The swing of the velocity vector (Fig. 1d) from radial to transverse
was completed by June 2010. Because V_R has remained near zero and
V_T′ seems to be tending to zero at the end of 2010, any remaining flow
will have to be in the unobserved meridional component, V_N. For an
axisymmetric heliopause, it is expected that V_N ≈ 30 km s⁻¹ for the
meridional flow of the deflected distant interstellar plasma
(V = 26 km s⁻¹). The observed flow pattern therefore could be
consistent with Voyager 1 now being in the deflected interstellar plasma
flow; that is, Voyager 1 may have crossed the heliopause, having passed
through the transition layer (V_R = 0).

We have interpreted the trends in V_R after July 2007 as a spatial
phenomenon, as it is unlikely to be all or in part temporal. We cannot
construct a reasonable scenario dominated by temporal variations for
the monotonic time dependence of V_R, for example by ascribing it to a
continually accelerating inward motion of the termination shock for
almost 3 yr that brings the heliosheath at 115 AU to a dead stop for at
least eight months throughout a region ~2.5 AU thick (the distance
travelled by Voyager 1 in eight months). (All absolute distances are
measured relative to the Sun.) The effects of any inward motion of the
termination shock (such as to reduce V_R) would take at least 1 yr to
propagate the ~20 AU to Voyager 1, and it is hard to imagine how it
heliopause estimated from energetic-neutral-atom (ENA) images made by
NASA's Cassini spacecraft. For the last, we used the simplified formula
j_ENA = σ ∫ dr n j_ion ≈ σ L n j_ion, which relates the ENA intensity, j_ENA, at Cassini to
the ion intensity, j_ion, along the line of sight (distance, L) between the
termination shock and the heliopause through the image pixel containing
Voyager 1. Here n = 0.1 cm⁻³ is the cold interstellar neutral density and σ is the
(energy-dependent) charge-exchange cross-section for proton-hydrogen
collisions. We showed that the ENA and ion spectra could be brought into
agreement at Voyager 2 with a heliosheath thickness of L_2 ≈ 54 AU, whereas
the same normalization procedure applied to Voyager 1 results in
L_1 ≈ 27 AU.


could produce the sharp change in ∂V_R/∂t to zero observed in April
2010.

The spatial relationships in the transition layer of the plasma flow
measurements along the radial trajectory of Voyager 1 (heliographic
latitude, 36° N) are depicted in Fig. 2. If V_R remains zero, and the outer
end of the flow transition layer is really the heliopause, then the
Voyager 1 observations demand that the orientation of the heliopause
is normal to the radial trajectory of Voyager 1 (regardless of the values
of V_N and V_T′). The measured transition region (V_R ≈ 0) extends
from 113.5 AU to at least 115.7 AU. To relate this location to estimates
of the thickness, L, of the heliosheath, we have estimated L from the
ENA all-sky images from the Cassini Ion and Neutrals Camera,
whose energy range overlaps that of LECP. Using the ENA data for
Voyager 1 and the same method of analysis, we compute a
heliosheath thickness of L_1 ≈ 27 AU. Assuming that the termination
shock is still where Voyager 1 crossed it, at 94 AU, the estimated radius
of the heliopause along the trajectory of Voyager 1 should be 121 AU,
which is not inconsistent with our suggestion that Voyager 1 is actually
crossing the heliopause if V_R and V_T′ remain near zero beyond 116 AU.
This is why we called attention to the small (~27%) decrease in the
0.04-2-MeV ions (mentioned above, in the discussion of Fig. 1a) that
commenced just when V_T′ began increasing to zero. If the ion intensity
decrease continues, we would interpret it as a draining away of the
heliosheath's energetic ion population into the downstream interstellar
plasma flow. However, recent activity at Voyager 2 suggests that it
may be a global heliospheric response to a changing magnetic
configuration at the Sun.
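Two of the distances quoted in the text follow from simple arithmetic, which the short sketch below reproduces: the distance covered by the spacecraft in eight months at the quoted 17 km s⁻¹, and the heliopause radius obtained by adding the estimated heliosheath thickness to the termination-shock crossing distance. The function names are illustrative, and a Julian year of 365.25 days is assumed.

```python
AU_KM = 149_597_870.7               # kilometres per astronomical unit
SECONDS_PER_YEAR = 365.25 * 86_400  # Julian year (assumption)


def distance_travelled_au(speed_km_s, months):
    """Distance covered at constant speed over the given number of months."""
    seconds = (months / 12.0) * SECONDS_PER_YEAR
    return speed_km_s * seconds / AU_KM


def heliopause_radius_au(termination_shock_au, sheath_thickness_au):
    """Heliopause radius along the trajectory: shock distance plus thickness."""
    return termination_shock_au + sheath_thickness_au


layer_thickness = distance_travelled_au(17.0, 8.0)  # about 2.4 AU
heliopause = heliopause_radius_au(94.0, 27.0)       # 121 AU
```

The first number comes out slightly below the rounded ~2.5 AU quoted for the region that Voyager 1 traversed in eight months.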


©2011 Macmillan Publishers Limited. All rights reserved 


We must remember that Voyager 1, because it has no operational
instruments that can measure them, is 'blind' to the particles that
produce the suprathermal (~1-20-keV) pressure in the heliosheath
that dominates the dynamics within the heliosheath and hence also the
cross-heliopause force balance with the stress applied by the interstellar
magnetic field. It is this 'unseen' population that must be producing
the structure we measure as the transition layer, so there remains the
possibility that the heliopause may be completely different from anything
that has been suggested by contemporary theory. It would
not be the first time that the Voyager observations have surprised us.
Emerging local Kondo screening and spatial
coherence in the heavy-fermion metal YbRh₂Si₂


S. Ernst¹, S. Kirchner¹,², C. Krellner¹, C. Geibel¹, G. Zwicknagl³, F. Steglich¹ & S. Wirth¹


The entanglement of quantum states is both a central concept in
fundamental physics and a potential tool for realizing advanced
materials and applications. The quantum superpositions underlying
entanglement are at the heart of the intricate interplay of
localized spin states and itinerant electronic states that gives rise
to the Kondo effect in certain dilute magnetic alloys. In systems
where the density of localized spin states is sufficiently high, they
can no longer be treated as non-interacting; if they form a dense
periodic array, a Kondo lattice may be established. Such a Kondo
lattice gives rise to the emergence of charge carriers with enhanced
effective masses, but the precise nature of the coherent Kondo state
responsible for the generation of these heavy fermions remains
highly debated. Here we use atomic-resolution tunnelling
spectroscopy to investigate the low-energy excitations of a generic
Kondo lattice system, YbRh₂Si₂. We find that the hybridization of
the conduction electrons with the localized 4f electrons results in a
decrease in the tunnelling conductance at the Fermi energy. In
addition, we observe unambiguously the crystal-field excitations
of the Yb³⁺ ions. A strongly temperature-dependent peak in the
tunnelling conductance is attributed to the Fano resonance resulting
from tunnelling into the coherent heavy-fermion states that
emerge at low temperature. Taken together, these features reveal
how quantum coherence develops in heavy 4f-electron Kondo
lattices. Our results demonstrate the efficiency of real-space
electronic structure imaging for the investigation of strong electronic
correlations, specifically with respect to coherence phenomena,
phase coexistence and quantum criticality.

The heavy charge carriers of Kondo lattice systems form from a lattice 
of magnetic ions coupled to itinerant electrons. The strongly correlated 
quantum states in such heavy-fermion metals are characterized by 
enhanced effective charge-carrier masses, m*, and can give rise to a wide 
spectrum of emergent behaviour. These include various types of nor- 
mal-metallic paramagnetic, magnetically ordered and superconducting 
phases, the latter two being either competing or coexisting. The resulting 
energy scales are typically small enough to allow these systems to be 
‘tuned’ through different phases. Consequently, heavy-fermion metals 
are so far the only electronic systems in which the existence of a
continuous phase transition at zero temperature, a so-called quantum
critical point, has been established.

However, the precise nature of the coherent Kondo state in Kondo
lattice systems and the involved energy scales that are accessed on
lowering the temperature, T, remain enigmatic. This is mainly due
to the lack of reliable local probes at these low energy scales. For single
(or a few) magnetic adatoms on noble-metal surfaces, the formation of
a Kondo state has been addressed by scanning tunnelling microscopy
(STM). The hybridization of itinerant and localized degrees
of freedom is reflected in the observation of a so-called Fano resonance
of the tunnelling conductance. But to address the issue of evolving
spatial coherence, Kondo lattice systems need to be investigated.
Recently, STM work on URu₂Si₂ again revealed a Fano resonance
and, additionally, a gap-like structure in the 'hidden-order' phase


below a certain critical temperature, Ty = 17.5 K (refs 6, 7). In ref. 6 
this gap is related, by means of imaging heavy-quasiparticle inter- 
ference, to a rapid splitting of a light band into two new heavy-fermion 
bands, whereas in ref. 7 it is ascribed to a mean-field-type order
parameter. However, the concept of Kondo screening in actinide-based
compounds is debated, mainly because of the more delocalized
character of 5f electrons relative to 4f electrons.

We have therefore investigated a prototype Kondo lattice system,
YbRh₂Si₂. The local character of the 4f electrons is demonstrated, for
example, by the observation of a well-defined crystal-electric-field
(CEF) splitting of the J = 7/2 Hund's rule multiplet, where J denotes
total angular momentum. Moreover, the Kondo screening involving
the 4f moments and the conduction electrons has been well
established by the observation of a strong increase in m* (up to the order
of 10° free-electron masses). For YbRh₂Si₂, the single-ion Kondo
temperature of the thermally excited CEF states is, according to transport
measurements, T_K = 80-100 K (refs 19, 20); and entropy and
thermopower results reveal a single-ion Kondo temperature of ~30 K for the
CEF ground state. Exploring the formation of Kondo lattice coherence
at a temperature T_L, which is expected to occur below T_K, is one of the
main aims of the present work. We restrict our measurements to
T ≥ 4.6 K, which is well above the onset of antiferromagnetic order,
at T_N = 70 mK (which competes with the Kondo effect), and is within
a regime in which the heavy quasiparticles are still well defined, that is,
safely away from the quantum critical point. YbRh₂Si₂ is particularly
well suited for STM investigations because it can be perfectly cleaved
(see below) and because it is possible to compare the consequent
scanning tunnelling spectroscopy (STS) results with renormalized
band-structure calculations even at low energy.

YbRh₂Si₂ crystallizes in the tetragonal ThCr₂Si₂ structure with
lattice parameters a = 4.007 Å and c = 9.858 Å (Fig. 1a, inset). A
typical topography obtained at T = 4.6 K is presented in Fig. 1 and shows
that the crystal has cleaved perpendicular to the crystallographic c axis.
We observe the expected square arrangement of the surface atoms,
with a spacing that corresponds to either an ytterbium- or
silicon-terminated surface (cleavage of YbRh₂Si₂ is expected to take place
between the ytterbium and silicon planes). In accord with the low
residual resistivity of our samples (ρ_0 ≈ 0.5 μΩ cm; see Supplementary
Information, section I), we find a very low density of defects.
These defects cause changes in height of the order of 20 pm on top
of the atomic corrugations of the regular lattice (Fig. 1d). An analysis of
these defects, along with other observations outlined below, indicates
that Fig. 1 represents a silicon-terminated surface; details of this
analysis, as well as an example of an ytterbium-terminated surface, are
given in Supplementary Information, section II, and Supplementary
Fig. 1, respectively. In the following, we focus on silicon-terminated
surfaces, which are (as in pure URu₂Si₂ (ref. 6)) more frequently
observed.

An overview of our experimental conductance data, g(V, T) = dI/dV,
where I and V respectively denote current and voltage, is presented in
Fig. 2a. For clarity, only results for a few selected temperatures are shown,


¹Max Planck Institute for Chemical Physics of Solids, Nöthnitzer Strasse 40, 01187 Dresden, Germany. ²Max Planck Institute for the Physics of Complex Systems, Nöthnitzer Strasse 38, 01187 Dresden,
Germany. ³Institut für Mathematische Physik, TU Braunschweig, Mendelssohnstrasse 3, 38106 Braunschweig, Germany.
Figure 1 | Topography of a cleaved YbRh₂Si₂ single crystal at 4.6 K. a, View
(18 × 18 nm²) exemplifying the almost perfect surface in the a-b plane of the
tetragonal crystal structure shown in the inset. There was no sign of any other
structure in any of the six investigated single crystals. Gap voltage, 0.3 V;
current set point, 0.6 nA. b, Magnified view of a (2 × 2 nm²). Shown are raw
data except for a plane correction. c, Fast Fourier transform of topography in
a, yielding a distance between the surface atoms of 4 Å. d, Magnified scans along
the lines through two defects indicated in a. Data (blue line) acquired almost
perpendicular to the fast scan direction indicate excellent stability.


with data at T = 14 and 30 K offset and a small, linear background
contribution subtracted. At T = 4.6 K, we clearly resolve three features
with apparently different origins. First, g(V, T) has a V-shaped
depression around zero bias voltage, that is, for |V| < 10 mV. As shown
below, this scale is related to T_K and we henceforth refer to this feature
as the 'Kondo dip'. Second, we observe three peaks at around −17, −27
and −43 mV that are only weakly temperature dependent (Fig. 2a,
arrows). The two more clearly developed peaks, at −27 and −43 mV,
although broadened, have been resolved at temperatures of up to 40 K.
Third, a more strongly temperature-dependent peak is located at
V ≈ −6 mV (dashed line). We discuss these features separately in the
following.

The most clear-cut assignment is for the three peaks at −17, −27
and −43 mV. These are due to CEF excitations: inelastic neutron
scattering measurements revealed such excitations at 17, 25 and
43 meV, in excellent agreement with our STS results. Moreover, they
are clearly reflected in the renormalized band-structure calculation
(Fig. 3e). The observation of these peaks underlines that bulk
properties of YbRh₂Si₂ are predominantly probed by our STS. Because the
CEF excitations originate in ytterbium, this again makes an
ytterbium-terminated surface unlikely and rather points to termination by silicon.
We emphasize that CEF excitations in an ytterbium, that is, hole,
system with occupied CEF levels are expected at negative voltages in
STS (in contrast to cerium-based Kondo systems) and follow naturally
from the Kondo picture in multi-orbital systems.

To investigate the temperature evolution of these CEF-related 
peaks, we focus on the most prominent one, at —43 mV, and make 
use of the surprising experimental finding of a nearly symmetric 
Figure 2 | Tunnelling spectroscopy of YbRh₂Si₂. a, Overview of tunnelling
conductance, g(V, T), at selected temperatures. Spectra at T = 14 and 30 K are
offset for clarity. b, Spectra normalized to g(V = −80 mV, T). The dotted line
represents a background (data at 30 K scaled by a factor that accounts for the
temperature evolution of the Kondo dip) for g(V, T = 4.6 K) on top of which the
peak at approximately −6 mV (upper inset) is superposed. The dashed line
shows data at 4.6 K 'mirrored' from positive bias. The lower inset shows a
magnified view of the peak at −43 mV after subtraction of mirrored
positive-bias data. c-e, Temperature dependences estimated from b. c, CEF excitations
(arrows in a) exemplified for the −43-meV peak (red) and the Kondo lattice
peak at −6 meV (pink; dashed line in a). Lines are guides to the eye. d, Relative
depth, h_0, of the Kondo dip at zero bias; the line is a logarithmic fit. e, Width of
the peak at −6 mV; the line is a fit to equation (2). Locally resolved spectroscopy
at T = 4.6 K did not indicate any spatial variation of spectroscopic features
(Supplementary Information, section III). Our spatial homogeneity is in accord
with a silicon-terminated surface. Error bars, s.e. of the fitted parameters.


g(V, T) profile. If the g(V, T) data measured for positive voltages are
mirrored to negative voltages, they appear to form a background on
top of which the peak resides (Supplementary Information, section V).
This is exemplified for g(V, T = 4.6 K) by the dashed line in Fig. 2b. For
T < 40 K, the peak at −43 mV is clearly visible after this background
subtraction, as shown in the lower inset of Fig. 2b for two
representative temperatures. For all our measured samples, the peak is located
without any apparent temperature dependence at −43 ± 2 mV, an
observation that is consistent with its ascription to CEF excitations.
The temperature dependence of this peak's height is shown in Fig. 2c.

We now discuss the Kondo dip in our STS spectra around zero bias
voltage. These conductance curves (Fig. 2a) are to be compared with
those obtained on single magnetic Kondo impurities, where tunnelling
not only into the conduction band but also into the localized state
contributes to the STS signal. The coupling between the two tunnelling
channels results in a typical Fano resonance at energy E_0 and of width
Γ: g(V) ∝ (η + q)²/(η² + 1). Here η = 2(eV − E_0)/Γ and the
asymmetry parameter, q, relates the two tunnelling channels. YbRh₂Si₂
contains a dense lattice of magnetic ions, and simultaneous tunnelling
processes are again possible. As a result, the experimentally
obtained g(V, T) cannot be directly related to the local conduction (that
is, spd) electron density of states (DOS), ρ_RR(ε) = (1/π) Im⟨R|G⁰(ε)|R⟩,
where G⁰(ε) is the Green function for the bare conduction electron and R
is the sample position closest to the STM tip. For a conceptual
understanding of the Kondo dip around zero bias and its temperature
dependence, however, it suffices to consider tunnelling into the
conduction band as the dominant channel and ignore any direct tunnelling
into the ytterbium 4f states (Supplementary Information, section IV).
This approximation is justified by our focus on silicon-terminated
surfaces. Owing to the Kondo coupling between conduction electrons and
the lattice of 4f moments, the STS signal still contains information on
the ytterbium 4f DOS near the Fermi energy, E_F, and the Fano effect
occurs. For a single Kondo impurity, this has been explicitly
demonstrated previously. For the heavy-fermion lattice, the difference
between the experimentally obtained g(V, T) and ρ_RR(ε) can be
expressed in terms of the difference between the imaginary part of
the fully renormalized single-electron Green function, ⟨R|G(ε)|R⟩,


and ρ_RR(ε):

$$ g(V,T) \propto \frac{1}{\pi}\,\mathrm{Im}\,\langle R|G(\varepsilon) - G^{0}(\varepsilon)|R\rangle + \frac{1}{\pi}\,\mathrm{Im}\,\langle R|G^{0}(\varepsilon)|R\rangle $$

$$ \langle R|G(\varepsilon) - G^{0}(\varepsilon)|R\rangle = G_{RR}(\varepsilon) - G^{0}_{RR}(\varepsilon) \approx \sum_{n=1}^{\infty}\ {\sum_{r_1,\ldots,r_n}}' G^{0}_{R r_1} G^{0}_{r_1 r_2} \cdots G^{0}_{r_n R}\,\big(T(\varepsilon)\big)^{n} \qquad (1) $$

Here Σ′ denotes a sum over all ytterbium sites, r_i, of the Kondo
lattice, with each site occurring at most once in such a way that each
sequence {R, r_i (i = 1, ..., n), R} forms a self-avoiding loop. T(ε) is the T
matrix and is proportional to the ytterbium 4f DOS (Fig. 3b). We take
two complementary approaches to addressing this problem
(Supplementary Information, section IV). First, equation (1) can be
evaluated approximately by considering only the first term of the
sum. This approximation cannot describe the formation of a
hybridization gap in a Kondo lattice, yet it yields reasonable results at increased
energies and temperatures. The calculated spectra are presented for
several temperatures in Fig. 3c, and the resulting temperature evolution
of the depth of the Kondo dip is shown in Fig. 3d. A comparison
between these results and experimental data of Fig. 2d can be found
in Supplementary Information, section IV. Second, a
Fermi-liquid-based renormalized band-structure calculation for the lattice-coherent
features, for example the incomplete hybridization gap and the full CEF
scheme (implemented only incompletely in Fig. 3a), complements the
first approach. This is shown in Fig. 3e, f. These considerations offer
insight into the nature of the Kondo dip: loosely speaking, the on-site
Kondo interaction (that is, the formation of local Kondo singlets)
diminishes the conduction electron DOS available for tunnelling, which
causes a reduction in conductance below |V| ≈ 10 mV, a scale that
roughly corresponds to T_K.

The Kondo dip smoothly becomes shallower with increasing T
(Fig. 2b), albeit substantially faster than is expected for thermal broad-
Figure 3 | Heavy-quasiparticle formation in a Kondo lattice. a, The
renormalized quasiparticle bands. The local ytterbium 4f levels well below E_F
split up into four Kramers doublets, ε_f^m < E_F, m = 1, ..., 4. The doubly occupied
states are located at ε_f^m + U_mm′ < E_F, m, m′ = 1, ..., 4. Owing to the
hybridization between the conduction band and the 4f moments, a
hybridization gap occurs near E_F in the quasiparticle bands at low
temperatures. The hole character of the ytterbium 4f¹³ states requires E_F to be
located in the upper quasiparticle band. The red curve shows a band that, owing
to symmetry, does not hybridize with the 4f states and remains unaffected.
b, Local spectral function obtained within the multi-level, finite-Coulomb-
repulsion non-crossing approximation (Supplementary Information, section
IV). The ytterbium hole character results in a Kondo resonance slightly below
E_F. c, The first term of equation (1) evaluated with the multi-level, finite-
Coulomb-repulsion non-crossing approximation at several temperatures
(increasing in the direction of the arrow), demonstrating the formation of the
Kondo dip. d, Depth of the Kondo dip obtained from c. e, Quasiparticle DOS
for YbRh₂Si₂ within a renormalized band-structure calculation. The
renormalized CEF positions, ε̃_f^m, agree with our STS data. We note that the ε̃_f^m
are much smaller than the atomic energies, ε_f^m. f, The region of ε around E_F: a
small pseudogap has formed at around −2 meV. The formation of the
incomplete hybridization gap is discussed in the Supplementary Information,
section IV.
ening alone. Its depth, h_0, is determined as described in Supplementary
Information, section V, and h_0(T) is presented in Fig. 2d. It can be
reasonably well described by a logarithmic temperature dependence
within a range that is centred on T_K = 80-100 K (Fig. 2d, dashed line).
As already mentioned, such behaviour is consistent with the
temperature dependence of both the thermopower and the resistivity of
YbRh₂Si₂, which show maxima at 80 and 100 K, respectively. Whereas
the former maximum is a good measure of T_K, the latter results from a
decreasing Kondo scattering of the charge carriers when, on cooling
the system, the 4f electrons increasingly 'condense' into their
CEF-derived Kramers doublet ground state. Hence, the observed
temperature dependence of the Kondo dip supports our conjecture that
this reduced g(V) for small V is related to the on-site Kondo interaction
between the 4f states and the conduction electrons.

The nearly symmetric voltage dependence of the Kondo dip is
compatible with tunnelling into the conduction electron band being
predominant. This predominance is consistent with a silicon-terminated
surface and supports the corresponding approximation of equation (1).
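The link between a symmetric dip and dominant conduction-band tunnelling can be illustrated with the single-impurity Fano line shape quoted earlier, g ∝ (η + q)²/(η² + 1) with η = 2(eV − E_0)/Γ. The sketch below is illustrative only: the resonance position, width and q values are arbitrary choices, not fitted parameters, and the function name is invented for the example.

```python
import numpy as np


def fano_lineshape(bias_mv, e0_mev, width_mev, q):
    """Fano line shape (eta + q)^2 / (eta^2 + 1).

    bias_mv  : bias voltage in mV (so that eV is in meV)
    e0_mev   : resonance energy E_0 in meV
    width_mev: resonance width in meV
    q        : asymmetry parameter relating the two tunnelling channels
    """
    eta = 2.0 * (np.asarray(bias_mv, dtype=float) - e0_mev) / width_mev
    return (eta + q) ** 2 / (eta ** 2 + 1.0)


bias = np.linspace(-40.0, 40.0, 81)  # symmetric bias grid in mV

# q = 0: tunnelling into the conduction band only, giving a symmetric dip.
symmetric_dip = fano_lineshape(bias, e0_mev=0.0, width_mev=20.0, q=0.0)

# q = 1: comparable weight in both channels, giving an asymmetric profile.
asymmetric = fano_lineshape(bias, e0_mev=0.0, width_mev=20.0, q=1.0)
```

For q = 0 the expression reduces to η²/(η² + 1), which vanishes at the resonance energy and is even in η. Any appreciable direct tunnelling into the localized state (q ≠ 0) would skew the profile.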

The lowest-energy peak (at V ≈ −6 mV; the third feature of the
conductance data pointed out above) and, in particular, its temperature
dependence can be accounted for neither by CEF excitations nor within
a single-ion picture. Instead, both are naturally explained as a generic
Kondo lattice effect. Moreover, the renormalized band-structure
calculation yields a small hybridization gap a few millielectronvolts below
E_F (Fig. 3f). In real systems, a fully developed hybridization gap is often
not expected, owing to the momentum (k) dependence of the
hybridization and possible magnetic correlations, in contrast to model
calculations. Our theoretical considerations above indicate that the peak at
about −6 meV signals the formation of an incomplete hybridization gap
(Fig. 3a, f), in accordance with our renormalized band-structure
calculations for YbRh₂Si₂ and general considerations for a Kondo lattice.

An analysis of the peak at V ≈ −6 mV is severely hampered by the
nearby strong Kondo dip onto which this peak is superimposed. The
procedure applied to decompose these two features is described in
Supplementary Information, section V. The results for the g(V,
T = 4.6 K) data are presented in the upper inset of Fig. 2b. This peak
as well as those with 4.6 K < T < 30 K can best be fitted by Gaussians
(Fig. 2b, line in upper inset), which allows us to extract their heights,
positions and widths. The peak height quickly decreases with
temperature (Fig. 2c) and is estimated to be completely suppressed at
T ≈ 27 K, in good agreement with an estimate of 29 K for the
Kondo temperature of the CEF ground-state doublet in YbRh₂Si₂.
Independently of temperature, the peak remains at −6 meV, an energy
equivalent to a temperature of the order of, but slightly less than, the
single-ion T_K, as expected for a hybridization gap. Rather than the
exact position in energy of the Kondo lattice feature, which in a real
material is a complex quantity, we analyse its width, w. In the
single-impurity case, the zero-temperature width depends exponentially on
the hybridization and local conduction DOS, reflecting the many-body
character of the Kondo resonance. The increased broadening of the
feature at finite temperature can, in contrast to the peak height, be
accounted for by the relation


$$ w = 2\sqrt{(\pi k_{B} T)^{2} + 2\,(k_{B} T_{K})^{2}} \qquad (2) $$


where k_B is Boltzmann's constant. For our case of a coherent Kondo
lattice, a similar exponential dependence (with modified parameters)
can be expected. By analogy to equation (2), we analyse our
experimentally obtained w(T) in terms of an entirely phenomenological
expression: $w = 2\sqrt{(\pi k_{B} T)^{2} + 2\,(k_{B} T_{L})^{2}}$. A fit (Fig. 2e, dashed line)
yielded T_L = 30 ± 6 K for the zero-temperature width of the
lattice-coherent Kondo feature. We note that our analysis of w(T) does not
require the hybridization gap to be fully developed.
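The width analysis can be sketched in a few lines. The example assumes the relation has the form w = 2√((πk_BT)² + 2(k_BT_0)²), with T_0 standing for either the single-ion or the lattice scale. Because w² is linear in T_0², the sketch estimates T_0 from the mean of w² − 4(πk_BT)², which is a simplification and not necessarily the fitting procedure used by the authors. The synthetic temperatures and all function names are illustrative.

```python
import numpy as np

K_B_MEV_PER_K = 0.08617333262  # Boltzmann constant in meV per kelvin


def kondo_width_mev(temperature_k, scale_k):
    """Width w = 2 sqrt((pi k_B T)^2 + 2 (k_B T_0)^2), returned in meV."""
    thermal = np.pi * K_B_MEV_PER_K * np.asarray(temperature_k, dtype=float)
    intrinsic = K_B_MEV_PER_K * scale_k
    return 2.0 * np.sqrt(thermal ** 2 + 2.0 * intrinsic ** 2)


def estimate_scale_k(temperature_k, width_mev):
    """Estimate T_0 from measured widths using linearity of w^2 in T_0^2."""
    thermal = np.pi * K_B_MEV_PER_K * np.asarray(temperature_k, dtype=float)
    excess = np.asarray(width_mev, dtype=float) ** 2 - 4.0 * thermal ** 2
    t0_squared = np.mean(excess) / (8.0 * K_B_MEV_PER_K ** 2)
    return float(np.sqrt(max(t0_squared, 0.0)))


def energy_to_temperature_k(energy_mev):
    """Temperature equivalent E / k_B of an energy given in meV."""
    return energy_mev / K_B_MEV_PER_K


# Synthetic, noise-free widths for a 30 K scale at a few temperatures.
temperatures = np.array([4.6, 10.0, 14.0, 20.0])
widths = kondo_width_mev(temperatures, 30.0)
recovered_scale = estimate_scale_k(temperatures, widths)

# The -6 meV peak position expressed as a temperature (roughly 70 K).
peak_temperature = energy_to_temperature_k(6.0)
```

The conversion at the end shows why the −6 meV position is described as slightly below the single-ion scale of 80-100 K: 6 meV corresponds to about 70 K.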

These results give insight into the thermal evolution of the Kondo
effect in YbRh₂Si₂. The emerging local Kondo effect, which develops at
T_K = 80-100 K, averages over all involved CEF levels. On cooling,
excited CEF states become depopulated, and below T_L ≈ 30 K only the
lowest-lying CEF Kramers doublet is occupied. This allows for the
development of a spatially coherent state that is manifested by an
additional peak in g(V, T), at −6 mV, reflecting a 'Kondo lattice
resonance' related to an incomplete hybridization gap forming at around
the single-ion Kondo temperature of the lowest-lying CEF doublet.
Thus, by investigating the low-energy charge excitation spectrum we
are able to illustrate the progressive Kondo screening and the
emergence of coherent heavy-fermion states. This spectroscopic approach
opens the way to a detailed microscopic study of the regime at even
lower temperatures and will shed new light on, for example, quantum
critical electronic excitations.


METHODS SUMMARY 


We grew single crystals of YbRh₂Si₂ by an indium-flux method. The
platelet-shaped samples (thin dimension parallel to the crystallographic c direction) were
cleaved in situ under ultrahigh-vacuum conditions, at low temperature (~20 K)
and parallel to the crystallographic a-b plane. STS was conducted after stabilizing
each measurement temperature for several hours and by using a standard a.c.
lock-in amplifier technique (typical bias modulation of 200 μV, averaging for 200 ms at
each measured bias voltage). We used a generalized multi-level non-crossing
approximation to account roughly for the CEF splitting of the ytterbium 4f¹³ state
in the calculation of the current-voltage characteristics. Lattice coherence effects
were compared with renormalized band-structure calculations for YbRh₂Si₂.
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Nanoporous molecular frameworks'” are important in applica- 


tions such as separation, storage and catalysis. Empirical rules exist 
for their assembly but it is still challenging to place and segregate 
functionality in three-dimensional porous solids in a predictable 
way. Indeed, recent studies of mixed crystalline frameworks suggest 
a preference for the statistical distribution of functionalities 
throughout the pores’ rather than, for example, the functional 
group localization found in the reactive sites of enzymes*. This is 
a potential limitation for ‘one-pot’ chemical syntheses of porous 
frameworks from simple starting materials. An alternative strategy 
is to prepare porous solids from synthetically preorganized molecu- 
lar pores’. In principle, functional organic pore modules could be 
covalently prefabricated and then assembled to produce materials 
with specific properties. However, this vision of mix-and-match 
assembly is far from being realized, not least because of the chal- 
lenge in reliably predicting three-dimensional structures for 
molecular crystals, which lack the strong directional bonding found 
in networks. Here we show that highly porous crystalline solids can 
be produced by mixing different organic cage modules that self- 
assemble by means of chiral recognition. The structures of the 
resulting materials can be predicted computationally, allowing
in silico materials design strategies. The constituent pore modules
are synthesized in high yields on gram scales in a one-step reaction. 
Assembly of the porous co-crystals is as simple as combining the 
modules in solution and removing the solvent. In some cases, the 
chiral recognition between modules can be exploited to produce 
porous organic nanoparticles. We show that the method is valid 
for four different cage modules and can in principle be generalized 
in a computationally predictable manner based on a lock-and-key 
assembly between modules. 

A basic tool in the synthesis of functional extended solids is the 
ability to combine different chemical entities in a controlled and 
modular fashion. This has been demonstrated for structurally related, 
or ‘isoreticular’, porous metal-organic frameworks (MOFs). MOFs
can be prepared with more than one chemical function, either by direct
reaction of mixed precursors or by post-synthetic modification.
Although both MOFs and zeolites can comprise fused, compartmen- 
talized cages, it is still generally challenging to segregate structural units 
in a programmed and predictable way. 

Most nanoporous networks are synthesized in ‘one-pot’ chemical 
reactions where all of the precursors are mixed together simultaneously.
The three-dimensional network structure arises from
self-assembly of the components. By contrast, natural products are 
synthesized in stepwise reaction sequences where isolable molecular 
intermediates are elaborated and combined to create more complex 
structures. An analogous, supramolecular strategy for porous organic
solids would be to preorganize larger chemical subunits, or pore
modules, before assembling the extended crystal. This approach
requires building blocks, or tectons, that are self-assembling,
prefabricated molecular analogues of the secondary building units in
networks such as MOFs. To be broadly useful, the pore modules
should pack together in predictable ways. Individual modules could 
then be designed to incorporate desirable chemical functionalities, 
either by chemical derivatization or by physical encapsulation within
the molecular pores. Mixing different functional modules might produce
porous solids with unusual properties, perhaps, for example, by
combining both acid- and base-containing cage modules within the 
same porous solid along with vacant, flow-through pores. In practice, 
however, many components of this strategy are currently missing. 
Although a large number of porous molecular solids are known,
as highlighted in a recent review, the rules that underpin their
three-dimensional, non-covalent assembly are poorly understood. In this
respect, the notion of ‘supramolecular synthesis’ is still unfulfilled.
Levels of porosity in such molecular organic solids are also modest:
until recently, Brunauer-Emmett-Teller specific surface areas,
SA_BET, of less than 400 m² g⁻¹ were typical. Moreover, porous
molecular solids could not be described as modular because almost 
all examples are single-component crystals. 

In this study, we report the production of porous organic molecular 
co-crystals, thus demonstrating a new modular assembly concept. We 
also describe computational methods to predict these crystal structures 
ab initio, greatly enhancing the long-term prospects for rational materials 
design. The materials were fabricated from combinations of the four
pore modules shown in Fig. 1a. The first porous co-crystal was constructed
from two organic cages that we described previously: cage 1
and cage 3-R. Porosity is covalently prefabricated in the individual
tetrahedral cage molecules such that each module has four triangular
pore windows with diameters of around 6 Å (Fig. 1a; see also scheme 1 in
Supplementary Information). Each cage is just over 1 nm in size. Both
cage modules have helical chirality: 1 comprises, in crystalline form, an
equimolar mixture of the helical enantiomers 1-S and 1-R, whereas
3-R is homochiral. Both cages are soluble in common solvents and can 
be simply mixed together in solution. Slow evaporation of an equimolar 
solution of 1 and 3-R did not lead to separate crystals of the individual 
modules, but rather to a new single-phase crystalline material. 
Remarkably, the material is a quasiracemic co-crystal, (1-S, 3-R).
That is, it consists exclusively of the S helical enantiomer of 1 crystallized
with 3-R (Fig. 1b). The apparent loss of the 1-R enantiomer, despite
100% sample mass recovery from crystallization, is explained by
variable-temperature ¹H NMR measurements, which show that the helical
configurations of 1 interconvert rapidly in solution (Supplementary
Fig. 2). The chirality of 1 is therefore dynamically resolved on
crystallization with the homochiral cage, 3-R (Fig. 2a). Cage 1 is an
‘amphichiral’ module: it can also pair with 3-S to form the opposite
quasiracemic co-crystal, (1-R, 3-S). As discussed below, however, this 
assembly strategy is not limited to dynamically chiral molecules. 

Figure 1 | Modular assembly of porous organic cages. a, Structures of four
organic cage modules (hydrogen atoms omitted for clarity). Cage 1 is shown as the S
enantiomer but this module is amphichiral and can interconvert between the R and S
forms. b-d, Crystal structures of porous organic solids formed from these modules,
with the Connolly surface shown in yellow (probe radius, 1.82 Å): quasiracemic
co-crystal (1-S, 3-R) (b), racemic crystal (3-S, 3-R) (c) and chiral crystal 5-R (d).

Figure 2 | Window-to-window assembly results in porosity. a, Helical
chirality in 1 is dynamically resolved by heterochiral co-crystallization with
3-R. The schematic packing diagram for (1-S, 3-R) shows the centres of
modules 1-S and 3-R as green and red spheres, respectively; orange spheres
represent interstitial voids that are not connected to the diamondoid pore
network, which is illustrated in yellow. Ea is the activation energy for conversion
between 1-S and 1-R, as measured by variable-temperature NMR. b, Nitrogen
gas sorption analysis for crystals and co-crystals shows that pore volume and
pore size can be varied systematically, as in isoreticular networks. Filled and
open symbols represent sorption and desorption isotherms, respectively. p0,
atmospheric pressure. c, Scheme showing packing for various crystals and
co-crystals.

The crystal packing for (1-S, 3-R) is also shown in Fig. 2: the 1-S and
3-R modules alternate in the crystal lattice in a face-centred cubic
arrangement, analogous to the ZnS ‘zinc blende’ structure. Each cage
forms window-to-window interactions with four partner cages of the 
other type. The result is an interconnected diamondoid pore network. 
No polymorphs of pure 1 have been found that pack in a window-to-window
fashion. Therefore, this packing mode is directed by the
presence of the chiral co-module, 3-R. The window-to-window packing
arrangement creates permanent micropore channels in the co-crystal,
which has a type-I nitrogen sorption isotherm at 77 K
(Fig. 2) and a specific surface area of SA_BET = 437 m² g⁻¹. Like the
other materials described here, the co-crystal is stable to desolvation 
and has good thermal stability, showing little weight loss until the onset 
of decomposition at 350 °C (Supplementary Fig. 7). 

The heterochiral pairing (1-S, 3-R) can be considered a directional
tecton, comparable to reversible supramolecular interactions such as
hydrogen bonding and the ‘sextuple aryl embrace’ that involves
interlocking aryl rings. Density functional theory (DFT) calculations
for isolated cage pairs indicate that the heterochiral window-to-window
interaction is 18 kJ mol⁻¹ more stable than the equivalent homochiral
interaction and much more stable than other hypothetical
window-to-arene or arene-to-arene pairs that would lead to disconnected
pores (Supplementary Fig. 8). Lattice energy calculations confirm
that this heterochiral pairing preference carries over to the solid
state and, more significantly, that the observed co-crystal structure can
be predicted ab initio from the molecular formulae of the modules.
Calculations using Monte Carlo simulated annealing to generate hypothetical
(1-S, 3-R) crystal structures, followed by energy minimization
using anisotropic atom-atom potentials, showed the observed
packing mode for (1-S, 3-R) to be the global lattice energy minimum
(Fig. 3), with good agreement between the ab initio predicted structure
and the experimental single-crystal X-ray structure (Fig. 3b). The most
stable hypothetical homochiral (1-R, 3-R) structure, which lacks
window-to-window packing, was predicted to be 18.8 kJ mol⁻¹ less
stable than the observed quasiracemate, (1-S, 3-R). These calculations
therefore rationalize the preference for 1 to adopt the 1-S configuration
in the co-crystal and to pack in a window-to-window fashion: that is,
both the preferred chirality and the resultant porosity in the solid can
be predicted ab initio. To verify the atom-atom-potential lattice energy
calculations, we performed solid-state DFT calculations on the
observed quasiracemate and low-energy predicted homochiral structures:
these calculations confirmed the preference for heterochiral
packing.

Figure 3 | Three-dimensional cage assembly can be predicted
computationally. a, Lattice energy rankings rationalize the heterochiral
packing preference for the (1, 3) co-crystal (structure b is favoured over all
hypothetical homochiral predicted structures), the racemic packing preference
for cage 3 (structure d is favoured over c) and the chiral preference for
5 (structure e is favoured over all hypothetical racemates). b-e, Packing
diagrams show the excellent fit between the calculated global-minimum
structures (blue) and the experimentally determined structures (red). The
predicted (1-S, 3-R) structure in b is slightly less symmetrical than the observed
R3 space-group symmetry. The P1 unit cell is shown.

This behaviour is not limited to the pairing of 1-S and 3-R. The
enantiomers 3-S and 3-R also strongly prefer heterochiral window-to-window
pairs and assemble in that fashion in a (3-S, 3-R) racemic
crystal (Fig. 1c) to give a porous solid with SA_BET = 873 m² g⁻¹. In
this case, the chirality in both modules is fixed rather than dynamic. As 
before, DFT simulations suggest a significant energy gain 
(19 kJ mol⁻¹) in the formation of heterochiral dimers. Again, the
crystal structure can be predicted ab initio. The experimentally 
observed racemic packing is the global energy minimum in the set 
of predicted crystal structures, and there is close agreement between 
the predicted and observed structures (Fig. 3d). These calculations also 
suggest a global preference for heterochiral packing modes rather than 
homochiral. A large energetic gain, of 32 kJ mol⁻¹, is calculated for the
(3-S, 3-R) racemic crystal over the most stable predicted homochiral 
structure for 3-R. The global-minimum homochiral prediction also 
closely reproduces the observed structure for 3-R (Fig. 3c), which,
unlike 1, can be obtained from enantiopure solutions because 3-R does 
not interconvert with its enantiomer. As for the (1-S, 3-R) co-crystal, 
the atom-atom lattice energy calculations were verified using periodic 
DFT calculations, which resulted in similar calculated energy differ- 
ences (Supplementary Table 1 and Supplementary Fig. 13). An ana- 
logous set of experimental observations and crystal structure 
predictions was obtained for a new cage module, 4-S, which has cyclo- 
pentane vertices rather than cyclohexane (Fig. 1a). This module forms 
a quasiracemic co-crystal, (4-S, 3-R), with SA_BET = 980 m² g⁻¹. In this
case, the predicted global-energy-minimum crystal structure is an
ordered version of the most probable site-disordered space-group-F4₁32
structure, according to powder X-ray data (Supplementary
Figs 12 and 15). By itself, 4-S does not pack in a window-to-window 
fashion (Supplementary Fig. 18). Hence, like (1-S, 3-R), this packing 
mode is directed by the partner module, 3-R. 
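
To give a feel for the size of the energy gaps quoted above (18.8 and 32 kJ mol⁻¹), the following minimal sketch converts a lattice-energy difference into a Boltzmann factor at room temperature. It is an illustration only: treating a calculated lattice-energy difference as a free-energy gap at 298.15 K is an assumption made here, not a step reported in the text, and the function name is hypothetical.

```python
import math

R = 8.314462618  # molar gas constant, J mol^-1 K^-1

def boltzmann_ratio(delta_e_kj_mol: float, temperature_k: float = 298.15) -> float:
    """Relative weight of a structure lying delta_e above the global minimum.

    Assumption: the lattice-energy gap is used as if it were a free-energy gap.
    """
    return math.exp(-delta_e_kj_mol * 1000.0 / (R * temperature_k))

# Energy gaps taken from the text; labels are for display only.
gaps = [
    ("best homochiral (1-R, 3-R) vs observed (1-S, 3-R)", 18.8),
    ("best homochiral 3-R vs racemic (3-S, 3-R)", 32.0),
]
for label, gap in gaps:
    print(f"{label}: relative weight {boltzmann_ratio(gap):.1e}")
```

Under this assumption both alternatives carry a weight far below one per cent, which is consistent with the statement that the directional interlocking of cages produces unusually clear-cut energy rankings.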

Not all systems favour heterochiral assembly and this, too, is pre- 
dictable from the calculated crystal energy landscape. A new module, 
5-R (Fig. 1a), was synthesized by the [4 + 6] cycloimination reaction
between tri(4-formylphenyl)amine and the chiral diamine (R,R)-1,2- 
cyclopentanediamine. Cage 5-R is substantially larger than modules 1, 
3 and 4. For example, the tetrahedron inscribed by the centres of the 
triangular faces of 5-R has a volume that is 3.8 times larger than the 
comparable tetrahedron for cage 1 (Supplementary Fig. 24). In this 
case, lattice energy calculations suggest homochiral window-to- 
window packing as the clear global energy minimum, and this pre- 
dicted structure is observed experimentally for 5-R (Fig. 3e); again, 
DFT calculations agree broadly with the energy differences obtained 
by atom-atom-potential lattice energy calculations. To our knowledge, 
5-R (1,702 g mol⁻¹) is the largest organic molecule to be successfully
tackled by crystal structure prediction. Numerous experiments involving
crystallization from mixtures of the modules 5-R and 5-S all led
exclusively to homochiral crystals, in agreement with the predicted 
lattice energy preference over all hypothetical racemic structures. 
The crystalline solid 5-R has larger pores (compare Fig. 1b, c with 
Fig. 1d) and a greater pore volume (0.63 cm³ g⁻¹) than any of the
materials produced from the smaller cages 1, 3 and 4. The surface area
of 5-R (SA_BET = 1,333 m² g⁻¹) exceeds all but one of the porous
molecular (non-network) crystals reported so far and is comparable
with the first generation of covalent organic frameworks. This
larger cage shows that it is possible to prepare molecular organic 
crystals with bespoke pore sizes, analogous to the well-known series 
of isoreticular MOFs where pore size is defined by organic strut
length. A future challenge will be to generalize this non-covalent 
assembly methodology. Non-identical molecules do not, as a general 
rule, co-crystallize, and it may be necessary to incorporate specific 
complementary functionality to induce co-crystallization of dissimilar 
modules. 

We have shown that porous cages can assemble in a modular fashion
and, moreover, that the mode of assembly can be predicted accurately
using lattice energy calculations. These particular structures are amenable
to computation because the directional interlocking of neighbouring
cages leads to large energy differences between hypothetical
structures. By contrast, most other organic molecules give rise to many
distinct possible crystal structures that differ in energy by only a few
kilojoules per mole. Larger, conformationally flexible cage modules
would be more challenging for these prediction methods, but significant
recent advances have been made in dealing with molecular
flexibility. Thus, the work presented here opens the way for in silico
prediction of structure and properties for new candidate porous
materials based solely on two-dimensional chemical sketches,
allowing ‘design by computational selection’.

Figure 4 | Module assembly in solution can be used to produce porous
nanoparticles. Mixing solutions of 3-S and 3-R leads to rapid precipitation of
racemic octahedral nanocrystals of (3-S, 3-R) with an average diameter of
130 nm (SA_BET = 873 m² g⁻¹). The micrographs show the same sample imaged
at two different magnifications.

The solution processability of the cage modules also means that the 
assembly approach can be extended to achieve structural control 
beyond the molecular length scale. For example, the (3-S, 3-R) 
racemate is at least ten times less soluble than the homochiral modules, 
3-S and 3-R, and this leads to spontaneous precipitation on mixing of 
solutions of the two enantiomers (Supplementary Movie 1 and 
Supplementary Fig. 27). Well-defined, porous (3-S, 3-R) nanocrystals 
are formed (Fig. 4), thereby translating intermolecular heterochiral 
tecton interactions into nanoscale morphology control. Porous nano- 
crystals might make particular applications of these solids possible in 
future, for example in chiral catalysis or separations. 


METHODS SUMMARY 


Synthesis of compounds. Cage 1, cage 4-R and cage 3-R were synthesized in a
[4 + 6] cycloimination reaction involving triformylbenzene and the diamines
ethylenediamine, (1S,2S)-cyclopentanediamine and (1R,2R)-cyclohexanediamine,
respectively, using an improved synthetic procedure that produces higher yields
than that reported previously (Supplementary Information). Cage 5-R was synthesized
by the [4 + 6] cycloimination reaction between tri(4-formylphenyl)amine
and (R,R)-1,2-cyclopentanediamine. Co-crystals were grown from equimolar
solutions of the partner cage modules. Details of the crystallographic analysis, 
crystal data and gas sorption analysis are described in Supplementary Information. 
Crystal structure prediction. Crystal structures were generated in the most com- 
monly observed space groups using a Monte Carlo simulated annealing search 
method. The lowest-energy structures from the Monte Carlo search were then 
lattice-energy-minimized using anisotropic atom-atom potentials within the crystal
structure modelling software DMACRYS. Molecular geometries, generated
by DFT single-molecule optimization, were treated as rigid throughout the pre- 
dictions. Further details are given in Supplementary Information. 
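
The search strategy described above (random trial moves accepted by a Metropolis criterion while a temperature parameter is lowered) can be sketched on a toy problem. The example below is not the crystal structure search used in the study and does not involve DMACRYS: the one-dimensional `toy_energy` function, the geometric cooling schedule and all parameter values are assumptions chosen only to show the mechanics of Monte Carlo simulated annealing.

```python
import math
import random

def toy_energy(x: float) -> float:
    """Hypothetical 1D 'energy landscape' with several local minima."""
    return 0.1 * x * x + math.sin(3.0 * x)

def simulated_annealing(energy, x0, steps=20000, t_start=2.0, t_end=1e-3,
                        step_size=0.5, seed=0):
    """Return the lowest-energy point visited during an annealing run."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for i in range(steps):
        # Geometric cooling schedule from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (i / (steps - 1))
        x_new = x + rng.uniform(-step_size, step_size)
        e_new = energy(x_new)
        # Metropolis criterion: always accept downhill moves,
        # accept uphill moves with probability exp(-dE / t).
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / t):
            x, e = x_new, e_new
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e

best_x, best_e = simulated_annealing(toy_energy, x0=4.0)
print(f"lowest energy found: {best_e:.3f} at x = {best_x:.3f}")
```

In a real structure search the single coordinate would be replaced by unit-cell parameters and molecular positions and orientations, and the lowest-energy candidates would then be passed to a separate lattice-energy minimization, as the text describes.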

DFT calculations. Cage pairs and crystal structures were fully optimized in the
mixed Gaussian and plane-wave code CP2K, using the TZVP-MOLOPT basis set
in combination with Goedecker-Teter-Hutter pseudopotentials and a plane-wave
cut-off of 400 Ry. Molecular and solid-state calculations used the BLYP and PBE
functionals, respectively, both with Grimme’s D3 dispersion correction.
Control of visual cortical signals by prefrontal dopamine

The prefrontal cortex is thought to modulate sensory signals in
posterior cortices during top-down attention, but little is known
about the underlying neural circuitry. Experimental and clinical 
evidence indicate that prefrontal dopamine has an important role 
in cognitive functions, acting predominantly through D1 receptors.
Here we show that dopamine D1 receptors mediate prefrontal
control of signals in the visual cortex of macaques (Macaca 
mulatta). We pharmacologically altered D1-receptor-mediated 
activity in the frontal eye field of the prefrontal cortex and mea- 
sured the effect on the responses of neurons in area V4 of the visual 
cortex. This manipulation was sufficient to enhance the mag- 
nitude, the orientation selectivity and the reliability of V4 visual 
responses to an extent comparable with the known effects of top- 
down attention. The enhancement of V4 signals was restricted to 
neurons with response fields overlapping the part of visual space 
affected by the D1 receptor manipulation. Altering either D1- or 
D2-receptor-mediated frontal eye field activity increased saccadic 
target selection but the D2 receptor manipulation did not enhance 
V4 signals. Our results identify a role for D1 receptors in mediating 
the control of visual cortical signals by the prefrontal cortex and 
suggest how processing in sensory areas could be altered in mental 
disorders involving prefrontal dopamine. 

Dopamine D1 receptors (D1Rs) are expressed by about one-quarter 
of all neurons in the prefrontal cortex and are localized primarily in 
superficial and deep layers. Microiontophoretic application of the
selective D1R antagonist SCH23390 at certain doses can increase the
persistent, working-memory-related component of single-neuron
activity in the dorsolateral prefrontal cortex. Given the role of the
prefrontal cortex in visual attention, we hypothesized that D1Rs
might also mediate the top-down control of visual cortical signals by
the prefrontal cortex. If so, then changes in D1R-mediated prefrontal
cortex activity might be sufficient to modulate signals in the posterior
visual cortex, similar to the modulation observed during selective
attention. The prefrontal cortex’s influence on the visual cortex is
achieved in part by the frontal eye field (FEF), an oculomotor area
within the posterior prefrontal cortex. The FEF has a well-established
role in saccadic target selection, but recent evidence also implicates
this area in the control of spatial attention. To test our hypothesis,
we locally infused small volumes (0.5-1 µl) of SCH23390 into sites in
the FEF of macaques performing fixation and eye movement tasks
(Fig. 1a, b and Supplementary Fig. 1). We measured the effects of
the FEF infusion on target selection using a free-choice saccade task.
In this task, monkeys were rewarded for choosing between two saccadic 
targets, one located within the FEF response field and one in the opposite 
hemifield. In the same experiment, we recorded the visual responses of 
single neurons in area V4 during fixation. In particular, we recorded 
neurons with response fields that overlapped the FEF response field. 
Thus, we tested the effects of the D1R manipulation on both visual
cortical signals and saccadic target selection. 

Figure 1 | Local manipulation of D1R-mediated activity in the FEF during
single-neuron electrophysiology in area V4. a, Lateral view of the macaque
brain depicting the location of a recording microsyringe in the FEF and of
recording sites in area V4. Bottom diagram shows saccades evoked via electrical
microstimulation at the infusion site (red traces) and the response field (RF,
green ellipse) of a recorded V4 neuron in an example experiment. b, Double-target
saccade task used to measure the monkey’s tendency to make saccades to
a target within the FEF response field versus one at an opposite location across
varying temporal onset asynchronies. Positive asynchrony values denote earlier
onset of FEF response field targets. Bottom plot shows the leftward shift in the
PES, indicating more FEF response field choices, after infusion of SCH23390
into an FEF site. c, Visual responses of a V4 neuron with a response field that
overlapped the FEF response field, measured during passive fixation. The plot
shows mean ± s.e.m. of visual responses to a bar stimulus presented at
orthogonal orientations before (grey) and after (red) the infusion of SCH23390
at the FEF site.

We found that altering D1R-mediated activity at FEF sites increased
the tendency of monkeys to choose targets appearing within the FEF
response field (Fig. 1b). In the free-choice task, the temporal onset of
the two targets was systematically varied such that the FEF response
field stimulus could appear earlier or later than the opposite stimulus.
A monkey’s tendency to select the FEF response field target could then
be measured as the temporal onset asynchrony required for an equal
probability of selecting either stimulus; we termed this the point of
equal selection (PES). In the example experiment shown, the monkey
chose the FEF response field target as often as the opposite target when
the former appeared 76 ms earlier (PES = 76 ms). However, infusion of
SCH23390 (0.85 µl) into the FEF reduced the PES by 23 ms (binary
logistic regression, P = 0.007), thereby increasing the proportion of
FEF response field target choices. 
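
The PES defined above can be read off a fitted choice curve. The sketch below assumes a two-parameter logistic model, P(choose response field target) = 1 / (1 + exp(−(b0 + b1·asynchrony))), for which the PES is −b0/b1. The text states only that binary logistic regression was used, so the model form, the function names and the synthetic trials (generated around a PES of 76 ms with an arbitrary slope) are assumptions for illustration.

```python
import numpy as np

def fit_logistic(x, y, n_iter=50):
    """Fit P(y=1) = 1 / (1 + exp(-(b0 + b1*x))) by Newton's method.

    x is standardized internally for numerical stability; the returned
    coefficients (b0, b1) are expressed in the original units of x.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mu, sd = x.mean(), x.std()
    X = np.column_stack([np.ones_like(x), (x - mu) / sd])
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        grad = X.T @ (y - p)             # gradient of the log-likelihood
        hess = X.T @ (X * w[:, None])    # negative Hessian
        beta = beta + np.linalg.solve(hess, grad)
    b1 = beta[1] / sd
    b0 = beta[0] - beta[1] * mu / sd
    return b0, b1

def point_of_equal_selection(b0, b1):
    """Asynchrony at which both targets are chosen equally often."""
    return -b0 / b1

# Synthetic trials (hypothetical values): asynchronies in ms, 200 trials each.
rng = np.random.default_rng(0)
soa = np.repeat(np.arange(-50.0, 201.0, 25.0), 200)
p_true = 1.0 / (1.0 + np.exp(-0.03 * (soa - 76.0)))
choice = (rng.random(soa.size) < p_true).astype(float)

b0, b1 = fit_logistic(soa, choice)
print(f"estimated PES: {point_of_equal_selection(b0, b1):.1f} ms")
```

A drug-induced reduction in the PES then appears as a leftward shift of the fitted curve, which is how the bottom plot of Fig. 1b is described.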

In the same experiment, we also measured the responses of V4 
neurons to oriented bars during fixation in a separate task (Fig. 1c 
and Supplementary Methods). We found that the increase in target 
selection after the SCH23390 infusion was accompanied by an 
enhanced V4 neuronal response to oriented bars appearing within 
the overlapping V4 and FEF response fields. The example neuron 
shown was selective for orientation: it responded more to the 45° than 
to the 135° bar stimulus (P< 10 °). After the infusion of SCH23390, 
there was a significant increase in the overall visual response of this 
neuron as well as a significant increase in the differential response to 
the two orientations (two-way analysis of variance, SCH23390 effect, 
P<10 *; SCH23390-orientation interaction, P< 10°). Thus, the 
local perturbation of D1R-mediated FEF activity not only caused the 
monkey to select FEF response field stimuli as saccade targets more 
frequently, it also led to enhanced and more selective visual responses 
of a V4 neuron representing the same part of space. 

We studied the visual responses of 37 V4 neurons with response
fields that overlapped the response fields of FEF infusion sites. The
average (mean ± s.e.m.) distance between V4 response field and FEF
response field centres was 0.71 ± 0.07 degrees of visual angle (d.v.a.)
(Fig. 2a). As with the example neuron, we measured the responses of all
neurons to oriented bars appearing in their response field during a 1 s
fixation period (Fig. 2b). Before the onset of the visual stimulus, there
was a significant elevation in baseline activity after the D1R manipulation
(Δ baseline = 0.077 ± 0.186, P = 0.030). In addition to the baseline
increase, the visually driven response of V4 neurons was enhanced
by 17% above the control response (Δ response = 0.121 ± 0.054,
P = 0.018). We confirmed that the enhancement in the visual response
was not due to systematic changes in eye position during stimulus
presentation (Supplementary Fig. 2). The enhancement of the
visual response was independently significant for both preferred
(Δ preferred = 0.264 ± 0.087; P = 0.004) and non-preferred stimuli
(Δ non-preferred = 0.132 ± 0.062; P = 0.032). There was also an
increase in the response difference between the preferred and
non-preferred orientations (Δ response difference = 0.132 ± 0.041;
P = 0.004) (Supplementary Fig. 3), indicating an increase in orientation
selectivity. To measure selectivity more quantitatively, we used a
receiver-operating characteristic (ROC) analysis to quantify the degree
to which each neuron’s responses could be used to judge stimulus
orientation (Fig. 2c). This analysis confirmed that V4 neurons were
more orientation selective after changes in D1R-mediated FEF activity
(Δ ROC area = 0.035 ± 0.009, P < 10⁻³). The enhancement in the
magnitude and selectivity of the V4 response was accompanied by a
decrease in the trial-to-trial variability of visual responses. We measured
the variability of V4 responses across trials by computing the
Fano factor, which is the variance in the spike count divided by
its mean. We found that the Fano factor of V4 responses was reduced
after the D1R manipulation (Δ FF = −0.105 ± 0.045; P<10-°)
(Fig. 2d and Supplementary Fig. 4). All three V4 effects were comparable
in magnitude to the known effects of top-down attention and
consistent with a multiplicative increase in the gain of visual signals
(Fig. 2e). 
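
The two population measures used above can be written down compactly. In this minimal sketch the Fano factor follows the definition given in the text (spike-count variance divided by its mean), and the ROC area is computed as the probability that a response to the preferred orientation exceeds a response to the non-preferred one, with ties counted as one-half. The choice of the sample variance (`ddof=1`), the tie handling, the function names and the synthetic Poisson counts are assumptions for illustration, not details taken from the study.

```python
import numpy as np

def fano_factor(spike_counts):
    """Variance of the spike count across trials divided by its mean."""
    counts = np.asarray(spike_counts, dtype=float)
    return counts.var(ddof=1) / counts.mean()  # sample variance (assumption)

def roc_area(pref_counts, nonpref_counts):
    """Area under the ROC curve for discriminating the two orientations.

    Equals P(pref > nonpref) + 0.5 * P(pref == nonpref) over all trial pairs.
    """
    a = np.asarray(pref_counts, dtype=float)[:, None]
    b = np.asarray(nonpref_counts, dtype=float)[None, :]
    return float(np.mean((a > b) + 0.5 * (a == b)))

# Synthetic spike counts (hypothetical rates) for one neuron.
rng = np.random.default_rng(1)
pref = rng.poisson(20, size=500)
nonpref = rng.poisson(15, size=500)

print(f"Fano factor (preferred): {fano_factor(pref):.2f}")
print(f"ROC area: {roc_area(pref, nonpref):.2f}")
```

An ROC area of 0.5 means that the two orientations cannot be told apart from spike counts, and values approaching 1 indicate strong selectivity. A reduction in the Fano factor corresponds to more reliable responses across trials.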

The effect of the D1R manipulation on saccadic target selection
was highly consistent across the two monkeys tested. In 21 double-target
experiments, the PES was reduced in every case (Fig. 3a). The
mean PES shifted in favour of the FEF response field stimulus by an
average of 27 ms (Δ PES = −26.934 ± 3.086, P< 10 °), significantly
increasing the overall proportion of FEF response field choices
(chi-squared = 80.60, P< 10 *) and thus indicating that the D1R
manipulation increased the monkeys’ tendency to target FEF response
field stimuli. The increase in target selection was apparent across a
range of drug dosages (Supplementary Fig. 5). In addition to the
D1R manipulation, we tested the effects of the D2R agonist quinpirole.
Previous studies using this drug found that it does not affect persistent
activity but rather increases saccade-related activity within the
dorsolateral prefrontal cortex. We found that local manipulation of
D2R-mediated FEF activity, like the D1R manipulation, increased the
selection of FEF response field targets (Fig. 3a). The PES shifted by an
average of 22 ms (Δ PES = −21.993 ± 6.758, P = 0.010), increasing
the proportion of FEF response field choices (chi-squared = 13.86,
P<10° ). Thus, the D1R- and D2R-mediated manipulations of FEF
activity resulted in equivalent increases in saccadic target selection.

Figure 2 | Manipulation of D1R-mediated activity enhances V4 visual
signals. a, Average vectors of saccades evoked at all FEF sites that overlapped
V4 response fields (left panel). The distribution of distances between the
endpoints of evoked saccades and the centres of overlapping V4 response fields
for 37 V4 neurons is shown in the right panel. b-d, The mean normalized
response magnitude (b), orientation selectivity (c) and response variability
(Fano factor) (d) of V4 neurons before (grey) and after (red) microinfusion of
SCH23390 into the FEF. Means ± s.e.m. are shown within a 100-ms moving
window measured during the 1-s response field stimulus presentation (top
event plot). Histograms to the right of each response profile show the
distributions of modulation indices for response magnitude (b), selectivity
(c) and variability (d) across the population of neurons. e, Comparison of V4
response modulation after the SCH23390 infusion for preferred and
non-preferred response field stimuli.
Figure 3 | Changes in saccadic target selection and V4 visual responses.
a, Scatter plot shows the consistent increase in FEF response field target choices
(decrease in PES) after manipulation of both D1R-mediated (circles) and D2R- 
mediated (triangles) FEF activity. For both drug effects, the increase in FEF 
response field target selection was constant across a range of control PES values; 
the slope in the linear fit did not differ significantly from unity in either case 


Despite the increase in target selection, manipulation of D2R-mediated
activity in the FEF failed to enhance the responses of V4
neurons. We found no significant effect on the visual response magnitude,
orientation selectivity or response variability of V4 neurons
after the D2R manipulation (Δ response = 0.001 ± 0.048, P = 0.999;
Δ ROC area = −0.007 ± 0.010, P = 0.426; Δ FF = 0.037 ± 0.052,
P = 0.338; n = 15) (Fig. 3b). Moreover, the changes in these measures
were all significantly different from the changes we observed after
the D1R manipulation (Δ response_D2R < Δ response_D1R, P = 0.045;
Δ selectivity_D2R < Δ selectivity_D1R, P = 0.011; Δ FF_D2R > Δ FF_D1R,
P = 0.019). Thus, the equivalent effects of D1R and D2R manipulations
on saccadic target selection were accompanied by contrasting
effects in V4, with the enhancement of visual signals being specific
to D1R-mediated activity. We also found that this enhancement was
confined to V4 neurons with response fields that overlapped the FEF
response field. For V4 neurons with response fields that did not overlap
the FEF response field (mean distance between V4 response field and
FEF response field = 9.00 ± 0.86 d.v.a.; n = 15), we found no significant
effect of the D1R manipulation on response magnitude
(Δ response = −0.028 ± 0.087, P = 0.9780), orientation selectivity
(Δ ROC area = −0.017 ± 0.010, P = 0.187) or the Fano factor
(Δ FF = 0.010 ± 0.043, P = 0.688). Of note, the changes in these measures
were all significantly different from the changes observed in neurons
with overlapping response fields (Δ response_non-overlap < Δ response_overlap,
P = 0.044; Δ selectivity_non-overlap < Δ selectivity_overlap, P = 0.007;
Δ FF_non-overlap > Δ FF_overlap, P = 0.034) (Fig. 3b). Thus, the enhancement
in visual cortical signalling produced by manipulation of D1R-mediated
FEF activity was spatially specific.

We also tested the effect of complete inactivation of FEF sites on the
responses of V4 neurons with overlapping response fields. Previous
studies have shown that local inactivation of the FEF disrupts saccadic
target selection and impairs attention'”*!. We therefore wondered if
inactivation could reduce the components of V4 responses that were
enhanced by the D1R manipulation. We locally inactivated FEF sites
using the GABA_A (γ-aminobutyric acid subtype A) receptor agonist
muscimol. Unlike the sparsely expressed D1Rs, GABA_A receptors are
expressed by all neurons in all cortical layers*’. As in previous studies,
local inactivation of FEF sites with muscimol decreased the targeting of
FEF response field stimuli. It also significantly reduced V4 orientation
selectivity (ΔROC area = −0.030 ± 0.011, P = 0.003; n = 33).
However, the inactivation did not change the response magnitude or
variability of V4 neurons (Δresponse = 0.016 ± 0.061, P = 0.809;
ΔFF = −0.002 ± 0.023, P = 0.921) (Fig. 3b). Thus, in contrast to
the D1R manipulation, which altered all three components of V4
activity, complete inactivation altered only one. All three
inactivation effects were significantly different from the D1R effects
(D1R: slope = 0.96, P = 0.552; D2R: slope = 0.97, P = 0.502). b, Changes in
response magnitude, orientation selectivity and response variability (Fano 
factor) after each drug manipulation. Changes shown are mean differences 
from pre-infusion values. Error bars denote s.e.m.; *, P< 0.05; **, P< 0.01; 
*** P<0.001. 


(Δresponse_muscimol < Δresponse_D1R, P = 0.024; Δselectivity_muscimol <
Δselectivity_D1R, P<10°; ΔFF_muscimol > ΔFF_D1R, P = 0.007).
Although the reduction in orientation selectivity is consistent with
previous electrical microstimulation studies’* and with the effects of
inactivation on orientation discrimination”, the lack of a reduction in
response magnitude may seem inconsistent. However, we suggest that
this difference is due to variation between experimental paradigms
(Supplementary Discussion). Finally, we tested for any effect of
vehicle (saline) infusion into the FEF. The infusion of saline failed to
change the response magnitude, selectivity or variability of V4
neurons (Δresponse = 0.018 ± 0.048, P = 0.380; ΔROC area =
−0.010 ± 0.013, P = 0.569; ΔFF = −0.035 ± 0.061, P = 0.179;
n = 12) (Fig. 3b). All three measures were significantly different from
the D1R effects (Δresponse_saline < Δresponse_D1R, P = 0.045;
Δselectivity_saline < Δselectivity_D1R, P = 0.013; ΔFF_saline > ΔFF_D1R,
P = 0.009).

Our results identify prefrontal D1Rs as a component of the neural 
circuitry controlling signals in the visual cortex. Manipulation of D1R- 
mediated FEF activity was sufficient to enhance the magnitude, 
reliability and visual selectivity of neuronal responses in area V4, three 
known effects of visual attention. The observed enhancement might 
account for the benefits in visually guided behaviour that accompany 
attentional deployment (Supplementary Fig. 6), although a causal link 
between attentional modulation of visual cortical signals and visual 
perception remains to be established. We have demonstrated that 
visual representations in posterior areas can be altered merely by 
changes in dopamine tone in the prefrontal cortex. Given the complex 
effects of dopamine through D1Rs, one might predict that at 
‘optimum’ dopamine levels’, optimal top-down control of visual cor- 
tical signals would be achieved. 

The circuitry underlying top-down control of the visual cortex 
probably involves several different neuromodulators” and an array 
of different brain structures. Our results show that this circuitry 
involves prefrontal dopamine acting via D1Rs. In the dorsolateral 
prefrontal cortex, dopamine D1Rs are thought to modulate recurrent 
glutamatergic connections, thereby influencing activity related to 
working memory in this area*”*. This study shows that D1Rs contri- 
bute to the FEF’s control of visual signals by an analogous mechanism, 
namely by modulating long-range, recurrent connections between the 
FEF and the visual cortex (Supplementary Fig. 7). Because FEF neu- 
rons in the superficial layer are reciprocally connected with neurons in 
V4’’’, dopaminergic modulation of these connections via D1Rs in the 
superficial layer would be expected to mediate the FEF’s control of V4 
signals. The specificity of V4 effects to D1Rs, rather than D2Rs, might 
be explained by the relative absence of D2Rs in superficial layers of 
the prefrontal cortex**®. The equivalent effects of D1R and D2R
manipulations on target selection might be explained by the presence
of both receptor subtypes in infragranular layers of the cortex*°, where 
layer-V FEF neurons project to the superior colliculus”’. 

Impairments in saccadic control are prominent among the impair- 
ments exhibited in attention deficit/hyperactivity disorder (ADHD)”*. 
The observed influence of prefrontal D1Rs on saccadic target selection 
and visual cortical signals, combined with their known influence on 
persistent activity, may explain the behavioural links between saccadic 
control, attention and working memory” and the coincidence of their 
corresponding impairments in ADHD”. 


METHODS SUMMARY 


The effects of pharmacological perturbation of FEF activity on target selection and 
the visual responses of V4 neurons were studied in three macaques (Macaca 
mulatta) performing fixation and eye movement tasks (Supplementary 
Methods). All experimental procedures were in accordance with the National 
Institutes of Health guide for the care and use of laboratory animals and with 
the Society for Neuroscience guidelines and policies. They were also approved 
by the Stanford University animal care and use committee. Eye position was 
monitored with a scleral search coil. In each experiment, we infused small volumes 
of drug into sites in the FEF through a surgically implanted titanium chamber 
overlying the arcuate sulcus using a custom-made recording microinjectrode. We 
identified FEF sites by eliciting short-latency, fixed-vector saccadic eye movements 
with trains (50–100 ms) of biphasic current pulses (≤50 µA; 250 Hz; 0.25 ms
duration). In the same experiment, recordings from V4 neurons were made 
through a chamber overlying the prelunate gyrus. Response fields of V4 neurons 
were all located in the lower quadrant of the contralateral hemifield (<12° eccent- 
ricity). The position of the FEF microinjectrode was adjusted so that the saccade 
elicited by FEF microstimulation shifted the monkey’s gaze either to within the V4 
response field (overlapping) or far outside it (non-overlapping). 
Forces between clustered stereocilia minimize
friction in the ear on a subnanometre scale 


Andrei S. Kozlov1, Johannes Baumgart2, Thomas Risler3,4,5, Corstiaen P. C. Versteegh1,6 & A. J. Hudspeth1


The detection of sound begins when energy derived from an acoustic 
stimulus deflects the hair bundles on top of hair cells’. As hair 
bundles move, the viscous friction between stereocilia and the sur- 
rounding liquid poses a fundamental physical challenge to the ear’s 
high sensitivity and sharp frequency selectivity. Part of the solution 
to this problem lies in the active process that uses energy for 
frequency-selective sound amplification®*. Here we demonstrate 
that a complementary part of the solution involves the fluid- 
structure interaction between the liquid within the hair bundle 
and the stereocilia. Using force measurement on a dynamically 
scaled model, finite-element analysis, analytical estimation of hydro- 
dynamic forces, stochastic simulation and high-resolution inter- 
ferometric measurement of hair bundles, we characterize the origin 
and magnitude of the forces between individual stereocilia during 
small hair-bundle deflections. We find that the close apposition of 
stereocilia effectively immobilizes the liquid between them, which 
reduces the drag and suppresses the relative squeezing but not the 
sliding mode of stereociliary motion. The obliquely oriented tip 
links couple the mechanotransduction channels to this least dissi- 
pative coherent mode, whereas the elastic horizontal top connectors 
that stabilize the structure further reduce the drag. As measured 
from the distortion products associated with channel gating at 
physiological stimulation amplitudes of tens of nanometres, the 
balance of viscous and elastic forces in a hair bundle permits a 
relative mode of motion between adjacent stereocilia that encom- 
passes only a fraction of a nanometre. A combination of high- 
resolution experiments and detailed numerical modelling of 
fluid-structure interactions reveals the physical principles behind 
the basic structural features of hair bundles and shows quantita- 
tively how these organelles are adapted to the needs of sensitive 
mechanotransduction. 

A hair bundle is a microscopic array of quasi-rigid, cylindrical 
stereocilia separated by small gaps filled with viscous endolymph. 
Like an array of organ pipes, the stereocilia vary monotonically in 
length across the hair bundle (Supplementary Information section 
1). The tip of each short stereocilium is attached to the side of the 
longest adjacent stereocilium by a tip link, the tension in which con- 
trols the opening and closing of transduction channels. Adjacent 
stereocilia are also interconnected along all three hexagonal axes by 
horizontal top connectors. At the tall edge of the bundle in many 
species stands a single kinocilium, the process to which mechanical 
stimuli are applied and that is ligated to the adjacent stereocilia by 
kinociliary links. 

When a solid object such as a hair bundle moves through a viscous 
fluid, the interplay between viscosity and inertia produces a spatial 
gradient of fluid velocity and the shear between successive layers of 
fluid causes friction*. The characteristic decay length of the shear
waves created by an oscillating body scales as √(η/(ωρ)), in which η
is the fluid’s dynamic viscosity, ρ is its density and ω is the angular
frequency of motion’. Because this length scale greatly exceeds the
distance between stereocilia, viscous forces can couple all motions
within a hair bundle. On the other hand, the pivotal stiffness of indi- 
vidual stereociliary rootlets opposes deflection. The viscous forces in the 
endolymph, elastic forces in the stereociliary pivots and links, and (at 
high frequencies) inertial forces associated with the liquid and stereo- 
ciliary masses together determine all the motions within a bundle. 
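
As a rough numerical illustration of the length-scale argument in this paragraph, the sketch below evaluates √(η/(ωρ)) over a range of frequencies. The water-like viscosity and density, the chosen frequencies and the function name are assumptions made for illustration; they are not values reported in this study.

```python
import math


def shear_decay_length(eta, rho, freq_hz):
    """Return the characteristic decay length sqrt(eta / (omega * rho)) in metres."""
    omega = 2.0 * math.pi * freq_hz  # angular frequency in rad/s
    return math.sqrt(eta / (omega * rho))


# Assumed water-like liquid properties (illustrative, not measured here).
ETA = 1.0e-3  # dynamic viscosity, Pa s
RHO = 1.0e3   # density, kg m^-3

for f in (10.0, 100.0, 1_000.0, 10_000.0):
    length = shear_decay_length(ETA, RHO, f)
    print(f"{f:>8.0f} Hz: {length * 1e6:6.1f} micrometres")
```

Under these assumed values the length is about 13 µm at 1 kHz and shrinks only as the inverse square root of frequency, so it stays in the micrometre range across the frequencies sampled.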

Although stereociliary motion can be measured directly with an inter- 
ferometer (Supplementary Information section 1), a qualitative appre- 
ciation of the liquid’s movement can be obtained from the associated drag. 
When a fluid moves between nearby cylinders with axes perpendicular to 
the flow, the drag on each cylinder exceeds that on an identical cylinder 
placed alone in a flow with the same average velocity. At a Reynolds 
number well below one, this effect is strong and long-range®*’. One might 
therefore expect a drag coefficient for a hair bundle several hundred times 
that of an isolated stereocilium. Instead, the measured values are of similar
magnitude: for six interferometric measurements in each case, the drag
coefficient for a single stereocilium is 16 ± 5 nN s m⁻¹, whereas that for
an entire bundle lacking tip links is only 30 ± 13 nN s m⁻¹. Because we
determined the drag coefficient for hair bundles that lacked tip links and
displayed coherent Brownian motion, the latter value is about a quarter of
that typically reported in the literature’. We note that these values
resemble those calculated for geometrical solids of similar dimensions
pivoting at their bases and evaluated at their tips’!°: 14 nN s m⁻¹ for a
cylinder of the size of a stereocilium and 29 nN s m⁻¹ for a hemi-ellipsoid
with the dimensions of a hair bundle. The small difference between the 
drag coefficients for a single stereocilium and for an entire hair bundle 
reveals the striking advantage that grouping stereocilia in a tightly packed 
array offers to the auditory system. 

Although stereocilia may slide past each other quite easily, large 
forces are required to squeeze them together or separate them. To 
estimate these forces, we constructed a macroscopic model of a hair 
bundle with the surrounding liquid, preserving the scaling between the 
physical quantities of importance (Supplementary Information section 
2). A simplified model of a bullfrog’s hair bundle enlarged 12,000 times 
was placed in a 2.2% solution of methylcellulose, which is 5,000 times 
as viscous as water. A single stereocilium was pulled at speeds of
0.015–1.11 mm s⁻¹ while the frictional force was measured. After rescaling
the time, length and mass values to those of a biological hair bundle, we
estimated the drag coefficient for the small-gap separation of a single
stereocilium to be 1,000–10,000 nN s m⁻¹, which is several hundred
times that for the movement of an isolated stereocilium. This order-of- 
magnitude demonstration confirmed that very large frictional forces 
oppose the squeezing motion, indicating the importance of hydrodyn- 
amics in the coupling of stereocilia. 
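
One simple way to see the size of the rescaling is to assume purely viscous (Stokes) flow, in which a drag coefficient scales as viscosity times length. The sketch below applies that assumption to the two scale factors quoted above. The Stokes scaling rule, the function names and the example model reading of 300 N s m⁻¹ are hypothetical; the authors' actual procedure is described in Supplementary Information section 2 and may differ.

```python
LENGTH_SCALE = 12_000     # enlargement of the model (from the text)
VISCOSITY_SCALE = 5_000   # methylcellulose relative to water (from the text)


def drag_scale_factor(length_scale, viscosity_scale):
    """Model-to-biological drag-coefficient ratio, assuming drag ~ viscosity * length."""
    return length_scale * viscosity_scale


def to_biological(drag_model, length_scale=LENGTH_SCALE, viscosity_scale=VISCOSITY_SCALE):
    """Rescale a drag coefficient measured on the model (N s/m) to the biological scale."""
    return drag_model / drag_scale_factor(length_scale, viscosity_scale)


# Hypothetical model reading, chosen only to illustrate the conversion.
model_drag = 300.0  # N s/m
bio_drag = to_biological(model_drag)
print(f"scale factor: {drag_scale_factor(LENGTH_SCALE, VISCOSITY_SCALE):.1e}")
print(f"biological drag: {bio_drag * 1e9:.0f} nN s/m")
```

Under this assumption the two factors multiply to 6 × 10⁷, so model drag coefficients of tens to hundreds of N s m⁻¹ would map onto the 1,000–10,000 nN s m⁻¹ range quoted above.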

Elastic forces become dominant in the low-frequency regime and 
inertial forces become dominant in the high-frequency regime of hair- 
bundle motion. To quantify the forces as a function of frequency, we 
developed a finite-element model in which we could manipulate the 
mechanical properties of the elastic links while explicitly representing 
the liquid around and between the stereocilia (Supplementary 


1Howard Hughes Medical Institute and Laboratory of Sensory Neuroscience, The Rockefeller University, 1230 York Avenue, New York, New York 10065, USA. 2Institute of Scientific Computing, Department
of Mathematics, Technische Universität Dresden, 01062 Dresden, Germany. 3Institut Curie, Centre de Recherche, F-75005 Paris, France. 4UPMC Université Paris 06, UMR 168, F-75005 Paris, France.
5CNRS, UMR 168, F-75005 Paris, France. 6Experimental Zoology Group, Wageningen University, 6709 PG Wageningen, The Netherlands.


376 | NATURE | VOL 474 | 16 JUNE 2011 


©2011 Macmillan Publishers Limited. All rights reserved 


Information section 3). The model has about 800,000 degrees of free- 
dom and is the first finite-element model to resolve the liquid motion 
in the gaps between stereocilia as well as in the outer boundary layer. 
The hair bundle is excited in the model by imposing an oscillatory 
displacement at varying frequencies on the kinocilium. 

First, we examined the model including only pivotal stiffness,
hydrodynamic drag and inertial mass (Supplementary Movie 1). At low
frequencies, the viscous force is small and only the stimulated
kinocilium and its tightly joined next neighbours move (Fig. 1a). The
associated drag coefficient is about 5,000 nN s m⁻¹ (Fig. 1b inset), a value in
agreement with the result obtained with the scaled dynamical model.
Because frictional forces increase linearly with frequency whereas
elastic coupling remains constant for a given displacement,
hydrodynamic coupling progressively entrains the whole hair bundle at higher
frequencies (Fig. 1a). As the squeezing modes subside, the drag
coefficient per stereocilium decreases, dropping by two orders of
magnitude by 100 Hz (Fig. 1b). Above that frequency the entire bundle
moves as a unit (Fig. 1a).

Exciting the hair bundle and recording the linear responses at its 
opposite edges allowed us to compute the coherence of motion, a quantity 
that could be directly compared with interferometric measurements’ 
(Supplementary Information sections 1, 3 and 4). A hair bundle in 
the finite-element model without any interstereociliary linkages 
displays a coherence exceeding 0.6 between 100 Hz and 5 kHz until
inertia intervenes at higher frequencies (Fig. 1c). Adding horizontal top
connectors with a stiffness of 20 mN m⁻¹ to the model strongly
increases the coherence, especially at low frequencies, and reduces 
the drag (Fig. 1b and c and Supplementary Movie 2). This value for 
the stiffness of top connectors was chosen such that the output coher- 
ence spectrum matched the experimental observations. It is corrobo- 
rated by the distortion-product experiments discussed below and 
accords with published experimental and modelling studies'**. 

Adding to the model tip links with a stiffness of 1 mN m⁻¹, rather
than top connectors, introduces some elastic coupling between the 
stereocilia of a given column (Fig. 1c and Supplementary Movie 3), 
but this coupling is inefficient. For low frequencies at which hydro- 
dynamic coupling is weak, only the excited column moves signifi- 
cantly. Moreover, because they are oriented obliquely, the tip links 
pull the stereocilia towards one another during positive deflections 
and allow them to separate during the complementary half-cycles. 
Both effects dramatically increase the drag, which originates almost 
entirely from the liquid within the hair bundle (Fig. 1b). 

Including both horizontal top connectors and tip links in the model 
increases the coherence for all frequencies below 5 kHz to 0.94 (Fig. 1c 
and Supplementary Movie 4), a value comparable to the experi- 
mental measurement. This model displays a low drag coefficient of 




Figure 1 | Finite-element analysis of fluid-structure interactions in a hair 
bundle. a, Three top views illustrate the calculated motion of a hair bundle 
without elastic elements other than the kinociliary links and rootlets in 
response to sinusoidal deflections of the kinocilium, which lies at the right in 
each diagram. The colour scale (at the bottom) identifies successive positions 
through one cycle of stimulation with phase progressing counterclockwise. As 
the frequency increases, the stereocilia display a transition from weakly coupled 
to collective motion. The frequency dependence of the drag coefficient (b), the 
coherence (c) and the stiffness (d) are obtained from the model with four 
configurations of the coupling between stereocilia: with only pivotal stiffness
and hydrodynamic drag (blue downtriangles); adding horizontal top
connectors with a stiffness of 20 mN m⁻¹ (purple squares); adding instead tip
links with a stiffness of 1 mN m⁻¹ (orange uptriangles); and adding both top
connectors and tip links (red circles). The drag coefficient in b was calculated in 
the presence of liquid both outside and inside the hair bundle (solid lines) as 
well as with the liquid inside only (dashed lines). The inset in b, which has axis 
labels identical to those of the main panel, displays the behaviour of two model 
configurations at extremely low frequencies. 
85 nN s m⁻¹ that changes little with frequency (Fig. 1b), with the drag
originating primarily from the external liquid but with some
contribution from relative motions in the bundle, and a stiffness of
450 µN m⁻¹ (Fig. 1d), similar to that reported for intact hair bundles*””.
We note that at frequencies below 1 kHz the tip links strongly increase 
the hair bundle’s drag, whereas the top connectors largely suppress this 
effect. At higher frequencies, the liquid alone provides such a strong 
coupling that the tip links do not affect the drag significantly. This 
frequency-dependent transition between elastic and viscous regimes 
might explain why some high-frequency hair cells, in particular mam- 
malian inner hair cells, apparently lack top connectors"®. 

We next explored the fluid-structure interactions in an analytically 
tractable and intrinsically stochastic model that allowed us to generate 
time series that could be compared directly with experiments (Sup- 
plementary Information section 5). Unlike the harmonic single-point 
excitation in the finite-element model, the movement in this instance 
was caused by the coupling of each individual stereocilium to the ther- 
mal bath through a Langevin equation. The movements between each 
pair of stereocilia were derived from a basis set of elementary motions, 
for which we solved the Stefan–Reynolds equations within the
lubrication approximation (Supplementary Information section 6).
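
To make the idea of coupling a stereocilium to the thermal bath through a Langevin equation concrete, the sketch below integrates a single overdamped degree of freedom, c·dx/dt = −k·x + thermal noise, with the Euler–Maruyama scheme. It is a one-variable toy and not the coupled model of Supplementary Information section 5. The drag value is the single-stereocilium figure quoted earlier, whereas the stiffness, temperature, time step and function names are assumptions chosen for illustration.

```python
import math
import random

K_B = 1.380649e-23  # Boltzmann constant, J/K


def simulate_overdamped(stiffness, drag, temperature, dt, n_steps, seed=0):
    """Euler-Maruyama integration of drag * dx/dt = -stiffness * x + thermal noise."""
    rng = random.Random(seed)
    # Noise strength follows from the fluctuation-dissipation relation.
    noise_amp = math.sqrt(2.0 * K_B * temperature * dt / drag)
    relax = stiffness * dt / drag  # fractional relaxation per time step
    x = 0.0
    trace = []
    for _ in range(n_steps):
        x += -relax * x + noise_amp * rng.gauss(0.0, 1.0)
        trace.append(x)
    return trace


def variance(values):
    """Population variance of a sequence of floats."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


DRAG = 16e-9         # N s/m, single-stereocilium drag quoted in the text
STIFFNESS = 1.0e-4   # N/m, hypothetical pivotal stiffness
TEMPERATURE = 295.0  # K, assumed room temperature

trace = simulate_overdamped(STIFFNESS, DRAG, TEMPERATURE, dt=1e-6, n_steps=300_000)
expected = K_B * TEMPERATURE / STIFFNESS  # equipartition: <x^2> = k_B T / k
print(f"simulated variance / equipartition value: {variance(trace) / expected:.2f}")
```

The time step is kept much shorter than the relaxation time c/k so that the result does not depend on the increment, and the simulated variance can be checked against the equipartition value as a basic sanity test.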

Setting the elastic coupling to zero, we obtained a damping matrix with 
eigenvalues spanning about three orders of magnitude from the least 
damped collective modes to the most damped relative ones (Fig. 2a). 
This analysis shows that drag values that are low and comparable to those 
measured experimentally arise only when the common modes predom- 
inate. We next simulated stereociliary motions that matched the experi- 
mental records in time resolution and computed the associated 
coherence, which exceeded 0.95 up to 5 kHz (Fig. 2b). Changing the 
elastic coupling in the model revealed its importance at low frequencies, 
whereas viscous coupling intervened at higher frequencies. 
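
The separation between weakly damped collective modes and strongly damped relative modes can be seen in a two-stereocilium toy version of the damping matrix. The sketch below assumes each stereocilium feels a drag c_external from the surrounding liquid plus a squeeze drag c_squeeze on their relative velocity; this two-body form, the function names and the numerical values (taken as orders of magnitude from the figures quoted earlier) are illustrative assumptions, not the 122-mode matrix analysed here.

```python
import math


def symmetric_2x2_eigenvalues(a, b, d):
    """Return the eigenvalues (ascending) of the symmetric matrix [[a, b], [b, d]]."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return mean - radius, mean + radius


def two_cilia_damping_modes(c_external, c_squeeze):
    """Eigenvalues of the damping matrix for two hydrodynamically coupled cylinders.

    The matrix is [[c_ext + c_sq, -c_sq], [-c_sq, c_ext + c_sq]]: the common
    mode (1, 1) is damped by c_ext and the relative mode (1, -1) by
    c_ext + 2 * c_sq.
    """
    diagonal = c_external + c_squeeze
    return symmetric_2x2_eigenvalues(diagonal, -c_squeeze, diagonal)


# Illustrative values in nN s/m (assumed orders of magnitude).
common, relative = two_cilia_damping_modes(16.0, 5000.0)
print(f"common mode:   {common:.0f} nN s/m")
print(f"relative mode: {relative:.0f} nN s/m")
print(f"ratio:         {relative / common:.0f}")
```

In this toy case the common mode keeps the small external drag while the relative mode is several hundred times more dissipative, which is the qualitative pattern described for the full eigenvalue spectrum.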

These results show that the magnitude of the relative motion in a 
hair bundle depends on the balance between hydrodynamic and elastic 
forces. That hair bundles undergoing Brownian motion display a high 
coherence" indicates that the relative mode is very small, which makes 
it difficult to detect and quantify. We therefore devised an experiment 
in which hair bundles were stimulated at physiological amplitudes to 
evoke channel gating and cause intrinsic oscillations at the combina- 
tion frequencies (Supplementary Information section 7). Because the 
gating of each mechanotransduction channel in a hair bundle changes 
Figure 2 | Fluid-structure interactions in a stochastic model. a, Calculations
for a model with only pivotal stiffness and hydrodynamic drag, two degrees of 
freedom per stereocilium, and no kinocilium yield 122 eigenmodes, of which 
four representative examples are shown. The eigenmodes of the damping 
matrix progress from a collective mode with a low-drag eigenvalue to a relative 
mode that is a thousand times as dissipative. The reported eigenvalues are 
expressed in multiples of the smallest one. b, The calculated coherence of 
motion for a hair bundle with a top-connector stiffness of 20 mN m⁻¹ (orange)
or 20 µN m⁻¹ (blue) illustrates the importance of elastic linkages at low
frequencies and of viscous coupling at high frequencies. 
the force in the associated tip link’’, it must cause a relative motion of
the interconnected stereocilia that is balanced by the frictional drag 
and elastic linkages. Blocking the distortion products at one edge of the 
hair bundle while measuring the relative motion at the opposite edge 
allowed us to isolate and quantify the amount of splay between adja- 
cent stereocilia during small deflections, assess the forces at play and 
compare the results with our model. 

In agreement with a previous report’, using a flexible glass probe 
attached to a hair bundle’s tall edge to stimulate it at two frequencies 
evoked distortion products at several combination frequencies. These 
distortion products were robust at both edges of a hair bundle and
disappeared when the tip links were disrupted by
1,2-bis(o-aminophenoxy)ethane-N,N,N′,N′-tetraacetic acid (BAPTA),
confirming that the distortion
was caused by the gating of mechanotransduction channels. We then
used a stiff glass probe to stimulate the long edge of the hair bundle. The 
rigid probe in this key experiment prevented any internally generated 
motion from contaminating the signal at the tall edge, which therefore 
consisted purely of the two excitation frequencies. With this constraint, 
the distortion products were significant only at the free, short edge of the 
hair bundle (Fig. 3a). 

We related the distortion of the short-edge motion to the linear 
displacement by a power series. The inverse of the quadratic term of 
this fit was 0.14 ± 0.12 µm (n = 8) for the flexible probe and
1.6 ± 0.9 µm (n = 4) for the stiff probe. The distortions were therefore
reduced to less than a tenth of their original value when the bundle’s 
tall edge was forced to follow the stimulus signal exactly. The finite- 
element model with viscous coupling replicated this effect, with the 
top-connector stiffness determined independently from the other 
experimental data. The remaining distortions revealed that the relative 
movement between adjacent stereocilia was less than a nanometre, 
only a few times the size of a water molecule (Fig. 3a). 
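
The link between the quadratic term of the power series and the size of the distortion products can be illustrated with a two-tone signal. The sketch below assumes a response of the form y = x + x²/L, where L is the inverse of the quadratic term, and reads off the amplitudes at the second-order frequencies with a single-bin Fourier sum. The functional form, the 20 nm stimulus amplitude per tone, the sampling settings and the function names are assumptions for illustration; only the two tone frequencies and the two values of L come from the text and the Fig. 3 caption.

```python
import cmath
import math

SAMPLE_RATE = 1000    # Hz; one second of samples makes every integer Hz an exact bin
F1, F2 = 90.0, 115.0  # stimulus frequencies, Hz
AMPLITUDE_NM = 20.0   # hypothetical amplitude of each tone, nm


def amplitude_at(signal, sample_rate, freq_hz):
    """Peak amplitude of the component of a real signal at freq_hz."""
    acc = sum(
        s * cmath.exp(-2j * math.pi * freq_hz * i / sample_rate)
        for i, s in enumerate(signal)
    )
    return 2.0 * abs(acc) / len(signal)


def distortion_amplitudes(inverse_quadratic_term_nm):
    """Amplitudes (nm) at the second-order frequencies for y = x + x**2 / L."""
    times = [i / SAMPLE_RATE for i in range(SAMPLE_RATE)]
    stimulus = [
        AMPLITUDE_NM * (math.cos(2 * math.pi * F1 * t) + math.cos(2 * math.pi * F2 * t))
        for t in times
    ]
    response = [x + x * x / inverse_quadratic_term_nm for x in stimulus]
    freqs = (F2 - F1, 2 * F1, F1 + F2, 2 * F2)
    return {f: amplitude_at(response, SAMPLE_RATE, f) for f in freqs}


for label, length_nm in (("flexible probe", 140.0), ("stiff probe", 1600.0)):
    amps = distortion_amplitudes(length_nm)
    summary = ", ".join(f"{f:.0f} Hz: {a:.3f} nm" for f, a in amps.items())
    print(f"{label}: {summary}")
```

For this assumed form the component at f1 + f2 has amplitude A²/L, so with a 20 nm tone the stiff-probe value of L gives a distortion of about a quarter of a nanometre, more than tenfold smaller than with the flexible-probe value.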

To confirm further the correspondence between experiment and 
modelling, we tested the prediction that removal of the horizontal 
top connectors should diminish the coherence (Fig. 1c) and increase 
the overall drag (Figs 1b and 2a). We placed hair bundles in a Ca²⁺-free,
iso-osmotic solution of mannitol having the same viscosity as
Figure 3 | Experimental verification of model predictions. a, Power spectra
reveal that exciting a hair bundle with a stiff glass probe at two frequencies
(f1 = 90 Hz and f2 = 115 Hz) generates distortion products marked by peaks of
power-spectral density (PSD) at the second harmonics (2f1 = 180 Hz and
2f2 = 230 Hz) and at the combination frequency (f1 + f2 = 205 Hz). Because the
stiff probe suppresses internally generated movements at the tall edge (right panel), 
the distortion products are present only at the free short edge (left panel). The 
presence of distortion products directly demonstrates the relative mode of motion 
within the array. The schematic diagram of a hair bundle in the inset indicates the 
stimulating probe attached at the bundle’s top and the positions of the red and 
green laser spots used in the interferometric measurements. b, The coherence in 
perilymph (orange) declines appreciably in the presence of mannitol (blue), which 
disrupts the horizontal top connectors. The mean values are accompanied by 95% 
confidence intervals in light orange and light blue, respectively. 
saline solution but a lower ionic strength. This medium has been
reported to remove the top connectors’’, and we verified the treat- 
ment’s effect by transmission electron microscopy (Supplementary 
Information section 8). After 20 min of treatment, the top connectors 
were overstretched or broken, but not entirely absent (data not 
shown). Some elastic coupling thus persisted in mannitol. As our 
model predicted, the procedure decoupled the stereocilia and 
increased the drag, although quantitatively the effect was variable from 
cell to cell, presumably because of heterogeneity in the residual top 
connectors. For the same six cells in both conditions, the coherence 
between 100 Hz and 5 kHz declined from 0.96 ± 0.01 in perilymph to
0.83 ± 0.12 in mannitol (Fig. 3b). At the same time, the drag coefficient
in mannitol increased to 99 ± 63 nN s m⁻¹. Together with the close
match between the coherence values in the experiment and in the
models, this and the results above confirm the accuracy of the numerical
models and indicate that they capture the essential physics of the
fluid–structure interactions in a hair bundle.

In conclusion, because all stereocilia and the liquid between them 
move in unison over the whole auditory spectrum, with the relative 
motions apparent only on a sub-nanometre scale, most stereocilia 
inside the hair bundle are shielded from the external liquid and experi- 
ence little viscous drag. Although viscous forces might be thought to 
impair sensitivity and frequency selectivity, the hair bundle’s structure 
actually minimizes energy dissipation, making it easier for the active 
process to keep the ear tuned. The tight clustering of stereocilia even 
transforms liquid viscosity into an asset by using it as a simple means of 
activating numerous mechanosensitive ion channels in concert. 


METHODS SUMMARY 


The methods used in this study are described in the Supplementary Information. 
Force measurements on a scaled hair-bundle model respected the physiological 
character of the liquid flow. The finite-element method provided approximate 
solutions to partial differential equations reflecting the hair bundle’s geometry. 
The small amplitudes of motion allowed the elimination of nonlinear terms. The 
velocity variable of the liquid was replaced with the time derivative of the displace- 
ment; fluid pressure was approximated by linear shape functions and the displace- 
ments of liquid and solid were approximated by quadratic functions. The 
hydrodynamic forces between stereocilia were estimated analytically by solving 
the Stefan–Reynolds equations under the lubrication approximation, which is
valid when the gaps between adjacent stereocilia are much smaller than their 
diameter. Stochastic simulations based on these results were performed for a 
system of linearly coupled dynamic variables, following a Langevin description 
with Gaussian white noise at room temperature. The integration procedure was 
validated by choosing time steps small enough to ensure that the results were 
independent of the increment. The robustness of our conclusions was investigated 
by a detailed parameter-variation study. We tested the effects of inertia and of the 
estimated top-connector stiffness and confirmed the validity of our conclusions 
for mammalian hair bundles. 

Dual-beam differential interferometry was used to record stereociliary motions 
with sub-nanometre spatial and sub-millisecond temporal resolution. Fourier ana- 
lysis of the records was performed with the multitaper method to obtain coherence 
spectra as well as stiffness and drag coefficients. Distortion products were evoked by 
stimulating hair bundles with calibrated glass probes. These results were used to 
verify the predictions of the numerical model and to measure the relative mode of 
motion between stereocilia directly. Transmission and scanning electron micro- 
scopy was performed by standard techniques with minor modifications. 
Transcriptomic analysis of autistic brain reveals
convergent molecular pathology 

Autism spectrum disorder (ASD) is a common, highly heritable 
neurodevelopmental condition characterized by marked genetic 
heterogeneity’ *. Thus, a fundamental question is whether autism 
represents an aetiologically heterogeneous disorder in which the 
myriad genetic or environmental risk factors perturb common 
underlying molecular pathways in the brain*. Here, we demon- 
strate consistent differences in transcriptome organization 
between autistic and normal brain by gene co-expression network 
analysis. Remarkably, regional patterns of gene expression that 
typically distinguish frontal and temporal cortex are significantly 
attenuated in the ASD brain, suggesting abnormalities in cortical 
patterning. We further identify discrete modules of co-expressed 
genes associated with autism: a neuronal module enriched for 
known autism susceptibility genes, including the neuronal specific 
splicing factor A2BP1 (also known as FOX1), and a module 
enriched for immune genes and glial markers. Using high-throughput 
RNA sequencing we demonstrate dysregulated splicing of 
A2BP1-dependent alternative exons in the ASD brain. 
Moreover, using a published autism genome-wide association 
study (GWAS) data set, we show that the neuronal module is 
enriched for genetically associated variants, providing independ- 
ent support for the causal involvement of these genes in autism. In 
contrast, the immune-glial module showed no enrichment for 
autism GWAS signals, indicating a non-genetic aetiology for this 
process. Collectively, our results provide strong evidence for con- 
vergent molecular abnormalities in ASD, and implicate transcrip- 
tional and splicing dysregulation as underlying mechanisms of 
neuronal dysfunction in this disorder. 

We analysed post-mortem brain tissue samples from 19 autism 
cases and 17 controls from the Autism Tissue Project and the 
Harvard brain bank (Supplementary Table 1) using Illumina micro- 
arrays. For each individual, we profiled three regions previously impli- 
cated in autism*: superior temporal gyrus (STG, also known as 
Brodmann’s area (BA) 41/42), prefrontal cortex (BA9) and cerebellar 
vermis. After filtering for high-quality array data (Methods), we 
retained 58 cortex samples (29 autism, 29 controls) and 21 cerebellum 
samples (11 autism, 10 controls) for further analysis (see Methods for 
detailed sample description). We identified 444 genes showing signifi- 
cant expression changes in autism cortex samples (DS1, Fig. 1b), and 
only 2 genes were differentially expressed between the autism and 
control groups in cerebellum (Methods), indicating that gene expres- 
sion changes associated with autism were more pronounced in the 
cerebral cortex, which became the focus of further analysis (Sup- 
plementary Table 2). There was no significant difference in age, post 
mortem interval (PMI), or RNA integrity numbers (RIN) between 
autism and control cortex samples (Supplementary Fig. 1, Methods). 

Supervised hierarchical clustering based on the top 200 differentially 
expressed genes showed distinct clustering of the majority of autism 
cortex samples (Fig. 1a), including one case that was simultaneously 
found to have a 15q duplication (Methods, Supplementary Table 1), 
which is known to cause 1% of ASD®. Cortex samples from ten of the 
cases coalesced in a single tight-clustering branch of the dendrogram. 
Clustering was independent of age, sex, RIN, PMI, co-morbidity of 
seizures, or medication (Fig. 1a and Supplementary Fig. 2c). It is 
interesting to note that the two ASD cases that cluster with controls 
(Fig. 1a) are the least severe cases, as assessed by global functioning 
(Supplementary Table 12). We observed a highly significant overlap 
between differentially expressed genes in frontal and temporal cortex 
(P= 10 “4; Fig. 1b), supporting the robustness of the data and indi- 
cating that the autism-specific expression changes are consistent 
across these cortical areas. We also validated a cross section of the 
differentially expressed genes by quantitative reverse transcription 
PCR (RT-PCR) and confirmed microarray-predicted changes in 
83% of the genes tested (Methods, Supplementary Fig. 2b). Gene 
ontology enrichment analysis (Methods) showed that the 209 genes 
downregulated in autistic cortex were enriched for gene ontology 
categories related to synaptic function, whereas the upregulated genes 
(N = 235) showed enrichment for gene ontology categories implicated 
in immune and inflammatory response (Supplementary Table 3). 

To test whether these findings were replicable, and to further validate 
the results in an independent data set, we obtained tissue from an 
additional frontal cortex region (BA44/45) from nine ASD cases and 
five controls (DS2; Supplementary Table 4). Three of the cases and all of 
the controls used for validation were independent from our initial 
cohort. Ninety-seven genes were differentially expressed in BA44/45 
in DS2, and 81 of these were also differentially expressed in our initial 
cohort (P = 1.2 × 10⁻⁷³, hypergeometric test; Fig. 1b, c). Remarkably, 
the direction of expression differences between autism and controls was 
the same as in the initial cohort for all but 2 of the 81 overlapping 
differentially expressed probes. Hierarchical clustering of DS2 samples 
based on either the top 200 genes differentially expressed in the initial 
cohort or the 81 overlapping genes showed distinct separation of cases 
from controls (Supplementary Fig. 6). In addition, comparison of these 
differentially expressed results with another, smaller study of the STG in 
ASD’, revealed significant consistency at the level of differentially 
expressed genes, including downregulation of DLX1 and AHI1 
(Supplementary Table 5). Thus, differential expression analysis pro- 
duced robust and highly reproducible results, warranting further 
refined analysis. 
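
The overlap statistic used above can be sketched in a few lines of Python. This is a minimal illustration of an upper-tail hypergeometric test and not the code used for the analysis; in particular, the size of the background gene set is an assumption here (9,914 robustly expressed probes, the number given in Methods for network construction), so the value returned need not match the reported P value.

```python
from math import comb


def hypergeom_upper_tail(population, successes, draws, observed):
    """P(X >= observed) for X ~ Hypergeometric(population, successes, draws)."""
    total = comb(population, draws)
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(observed, upper + 1)
    )
    return favourable / total


# 444 genes differentially expressed in the initial cohort, 97 in the
# replication cohort, 81 shared. The background of 9,914 is an assumption.
p_overlap = hypergeom_upper_tail(population=9914, successes=444, draws=97, observed=81)
```

Exact integer arithmetic is used for the binomial coefficients, so very small tail probabilities are not lost to rounding before the final division.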

We next applied weighted-gene co-expression network analysis 
(WGCNA)*” to integrate the expression differences observed between 
autistic and control cerebral cortex into a higher order, systems level 
context. We first asked whether there are global differences in the 
organization of the brain transcriptome between autistic and control 
brain by constructing separate co-expression networks for the autism 
and control groups (Methods). The control brain network showed 
high similarity with the previously described human brain co-expression 
networks (Supplementary Table 7), consistent with the existence of 
robust modules of co-expressed genes related to specific cell types and 
biological functions®. Similarly, the majority (87%) of the autism 
modules showed significant overlap with the previously described 
human brain modules (Supplementary Table 6), indicating that many 
features reflecting the general organization of the autism brain 
transcriptome are consistent with that of the normal human brain. 

The expression levels of each module were summarized by the first 
principal component (the module eigengene), and were used to assess 
whether modules are related to clinical phenotypes or other 
experimental variables, such as brain region. Two of the control module 
eigengenes (cM6, cM13) showed significant differences (P < 0.05) 
between the two cortical regions as expected, whereas none of the 
ASD modules showed any differences between frontal and temporal 
cortex. This led us to explore the hypothesis that the normal molecular 
distinctions between the two cortical regions tested were altered in 
ASD compared with controls. Remarkably, whereas 174 genes were 
differentially expressed between control BA9 and BA41 (false discovery 
rate (FDR) < 1%), none of the genes were differentially expressed in the 
same regional comparison among the ASD cases. This was not simply 
an issue of statistical thresholds, as relaxing the statistical criteria for 
differential expression to an FDR of 5% identified over 500 differentially 
expressed genes in controls, and only 8 in ASD brains, confirming the 
large difference observed in regional cortical differential gene 
expression between ASD cases and controls (Fig. 1d, Methods). Analysis of 
differential expression from a data set’ of gene expression in 
developing fetal human brain showed a highly significant (P = 5.8 X 10°”) 
overlap of differentially expressed genes with those found in controls in 
this study, independently confirming that these genes differentiate 
normal temporal and frontal lobes. We evaluated the homogeneity of 
gene expression variance across the autism and control groups using 
Bartlett’s test (Methods), which indicated that increased variance was 
not the major factor responsible for the striking difference in regional 
gene expression between ASD and controls (Supplementary Fig. 7 and 
Supplementary Data). 

Figure 1 | Gene expression changes in autism cerebral cortex. a, Heat map of 
top 200 genes differentially expressed between autism and control cortex 
samples. Scaled expression values are colour-coded according to the legend on 
the left. The dendrogram depicts hierarchical clustering based on the top 200 
differentially expressed genes. The top bar (A/C) indicates the disease status: 
red, autism; black, control. The bottom bars show additional variables for each 
sample: sex (grey, male; black, female), brain area (black, temporal; grey, 
frontal), co-morbidity of seizures (green, autism case with seizure disorder; red, 
autism case without seizure disorder; black, control), age, RNA integrity 
number (RIN) and post mortem interval (PMI). BA, Brodmann’s area. The 
corresponding scale for quantitative variables is shown on the left. b, Top, Venn 
diagram depicting the overlap between genes differentially expressed in frontal 
and temporal cortex. Bottom, Venn diagram describing the overlap between 
genes differentially expressed in the initial cohort (DS1) and the replication 
cohort (DS2). Differential expression in the initial cohort was assessed at an 
FDR < 0.05 and fold change > 1.3. The statistical criteria were relaxed to 
P < 0.05 for the replication data set because it involved fewer samples. 
c, Expression fold changes for all genes differentially expressed in the initial 
cohort are plotted on the x-axis against the fold changes for the same genes in 
the replication cohort on the y-axis. Green, genes downregulated in the autism 
group in both data sets; red, genes upregulated in the autism group in both data 
sets; grey, genes with opposite direction of variation in the two data sets. 
Horizontal lines show fold change threshold for significance. d, Diagram 
depicting the number of genes showing significant expression differences 
between frontal and temporal cortex in control samples (top) and autism 
samples (bottom) at FDR < 0.05 (left). The top 20 genes differentially 
expressed between frontal and temporal cortex in control samples (right). All of 
the genes shown are also differentially expressed between frontal and temporal 
cortex in fetal midgestation brain’®, but show no significant expression 
differences between frontal and temporal cortex in autism. The horizontal bars 
depict P values for differential expression between frontal and temporal cortex 
in the autism and control groups. 

These data suggest that typical regional differences, many of which 
are observed during fetal development”, are attenuated in frontal and 
temporal lobe in autism brain, pointing to abnormal developmental 
patterning as a potential pathophysiological driver in ASD. This is 
especially interesting in light of a recent anatomical study of five cases 
with adult autism which demonstrated a reduction in typical ultra- 
structural differences between three frontal cortical regions in autism". 
Together, these independent studies provide both molecular and 
structural evidence suggesting a relative diminution of cortical regional 
identity in autism. 

To identify discrete groups of co-expressed genes showing transcrip- 
tional differences between autism and controls, we constructed a co- 
expression network using the entire data set, composed of both autism 
and control samples (Methods). As previously shown for complex 
diseases'*'* co-expression networks allow analysis of gene expression 
variation related to multiple disease-related and genetic traits. We 
assessed module eigengene relationship to autism disease status, age, 
gender, cause of death, co-morbidity of seizures, family history of 
psychiatric disease, and medication, providing a complementary 
assessment of these potential confounders to that performed in the 
standard differential expression analysis (Supplementary Table 9). 
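
The module eigengene and its relationship to a trait can be illustrated with the following Python sketch. It is a simplified stand-in for the WGCNA routines actually used: the first principal component is obtained here by power iteration on standardized gene profiles, and all function names are hypothetical. It assumes that no gene has constant expression and that the starting vector is not orthogonal to the leading component.

```python
import math


def standardize(values):
    """Scale values to zero mean and unit sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    zx, zy = standardize(x), standardize(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)


def module_eigengene(expression, n_iter=200):
    """First principal component across samples of a genes x samples matrix."""
    rows = [standardize(gene) for gene in expression]
    n_samples = len(rows[0])
    vec = list(rows[0])  # starting guess: the first standardized profile
    for _ in range(n_iter):
        # One power-iteration step: vec <- X^T X vec, then renormalize.
        scores = [sum(r * v for r, v in zip(row, vec)) for row in rows]
        new_vec = [sum(s * row[j] for s, row in zip(scores, rows)) for j in range(n_samples)]
        norm = math.sqrt(sum(v * v for v in new_vec))
        vec = [v / norm for v in new_vec]
    # Fix the arbitrary sign so that most genes correlate positively.
    if sum(pearson(row, vec) for row in rows) < 0:
        vec = [-v for v in vec]
    return vec


def module_membership(expression, eigengene):
    """kME: correlation of each gene's expression with the module eigengene."""
    return [pearson(gene, eigengene) for gene in expression]
```

Given an eigengene, its association with disease status can then be summarized as `pearson(eigengene, status)` with status coded, for example, as 1 for autism and 0 for control; the same call applies to age or any other numerically coded variable.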

The comparison between autism and control groups revealed two 
network modules whose eigengenes were highly correlated with disease 
status, and not any of the potential confounding variables (Supplemen- 
tary Table 9). We found that the top module (M12) showed highly 
significant enrichment for neuronal markers (Supplementary Table 9), 
and high overlap with two neuronal modules previously identified 
as part of the human brain transcriptional network®: a PVALB+ 


interneuron module and a module of genes involved in synaptic func- 
tion. The M12 eigengene was under-expressed in autism cases, indi- 
cating that genes in this module were downregulated in the autistic 
brain (Fig. 2). Consistent with the pathways identified to be down- 
regulated in autism by differential expression analysis (Supplemen- 
tary Table 3), the functional enrichment of M12 included the gene 
ontology categories involved in synaptic function, vesicular transport 
and neuronal projection. 

Remarkably, unlike differentially expressed genes, M12 showed sig- 
nificant overrepresentation of known autism susceptibility genes” 
(Supplementary Table 10; P=6.1 x 10%), including CADPS2, 
AHI1, CNTNAP2, and SLC25A12, supporting the increased power of 
the network-based approach to identify disease-relevant transcrip- 
tional changes. A further advantage of network analysis over standard 
analysis of differential expression is that it allows one to infer the 
functional relevance of genes based on their network position’. The 
hubs of M12, that is, the genes with the highest rank of M12 member- 
ship®, were A2BP1, APBA2, SCAMP5, CNTNAP1, KLC2, and CHRM1 
(Supplementary Data). The first three of these genes have previously 
been implicated in autism’*”*, whereas the fourth is a homologue of 
the autism susceptibility gene CNTNAP2 (ref. 17). We highlight the 
group of genes most strongly connected to the known ASD genes 
(Supplementary Fig. 5) and emphasize the downregulation of several 
interneuron markers, such as DLX1 and PVALB, as candidates for 
future genetic and pathologic investigations. 

Figure 2 | Gene co-expression modules associated with autism. a, d, Heat 
map of genes belonging to the co-expression module (top). Corresponding 
module eigengene values (y-axis) across samples (x-axis) (bottom). Red, 
autism; grey, controls. b, e, Visualization of the M12 and M16 modules, 
respectively. The top 150 connections are shown for each module. Genes with 
the highest correlation with the module eigengene value (that is, intramodular 
hubs) are shown in larger size. c, f, Relevant gene ontology categories enriched 
in the M12 and M16 modules. 

The second module of co-expressed genes highly related to autism 
disease status, M16, was enriched for astrocyte markers and markers of 
activated microglia (Supplementary Table 9), as well as for genes 
belonging to immune and inflammatory gene ontology categories 
(Fig. 2). This module, which was upregulated in ASD brain, showed 
significant similarity to two modules identified in previous studies of 
normal human brain*: an astrocyte module and a microglial module. 
Consistent with this functional annotation, two of the hubs of the M16 
module were known astrocyte markers (ADFP, also known as PLIN2, 
and IFITM2). 

One of the hubs of the M12 module was A2BP1, a neural- and muscle- 
specific alternative splicing regulator’* and the only splicing factor 
previously implicated in ASD'®. Because A2BP1 was downregulated 
in several ASD cases (Supplementary Fig. 8), this observation provided 
a unique opportunity to identify potential disease-relevant A2BP1 
targets. Whereas A2BP1-regulated alternative exons have been pre- 
dicted genome-wide’’, few genes have been experimentally validated 
as A2BP1 targets”’. To identify potential A2BP1-dependent differential 
splicing events in ASD brain, we performed high-throughput RNA 
sequencing (RNA-Seq) on three autism samples with significant 
downregulation of A2BP1 (average fold change by quantitative RT- 
PCR = 5.9) and three control samples with average A2BP1 levels. We 
identified 212 significant alternative splicing events (Supplemen- 
tary Data). Among these, 36 had been defined’? as predicted 
targets of A2BP1/2, which represents a highly significant overlap 
(36/176, P=2.2X10'°). In addition, five previously validated 
A2BP1 targets showed evidence of alternative splicing, four of which 
(ATP5C1, ATP2B1, GRIN1 and MEF2C) were confirmed as having 
differential splicing between ASD samples with low A2BP1 expression 
and control samples, indicating that we were able to identify a high 
proportion of the expected A2BP1-dependent differential splicing 
events. We also observe that alternative exons with increased skipping 
in ASD relative to control cases are significantly enriched for A2BP1 
motifs in adjacent, downstream intronic sequences (P = 1.09 x 10 ”, 
Fisher’s exact test), consistent with previous data’. 

The top gene ontology categories enriched among ASD differential 
splicing genes highly overlapped with the gene ontology categories 
found to be enriched in the M12 module (Fig. 3b). In addition, 
A2BP1 target genes showed enrichment for actin-binding proteins 
and genes involved in cytoskeleton reorganization (Fig. 3b). Among 
top predicted A2BP1-dependent differential splicing events (Fig. 3a) 
are CAMK2G, which also belongs to the M12 module, as well as 
NRCAM and GRIN1. The latter are proteins involved in synaptogenesis, 
in which allelic variants have been associated with autism and schizo- 
phrenia, respectively”. 

RT-PCR assays confirmed a high proportion (85%) of the tested 
differential splicing changes involving predicted A2BP1 targets (Sup- 
plementary Fig. 8). We further tested the differential splicing events 
validated by RT-PCR in three independent ASD cases with decreased 
A2BP1 levels and confirmed the predicted changes in alternative splic- 
ing (Supplementary Fig. 8), indicating that the observed differential 
splicing events are indeed associated with reduced A2BP1 levels, rather 
than due to inter-individual variability. The RNA-Seq data thus provide 
validation of the functional groups of genes identified by co- 
expression analysis, and evidence for a convergence of transcriptional 
and alternative-splicing abnormalities in the synaptic and signalling 
pathogenesis of ASD. 

To test whether our findings are more generalizable, and determine 
whether the autism-associated transcriptional differences observed 
are likely to be causal, versus collateral effects or environmentally- 
induced changes, we tested whether our co-expression modules or 
the differentially expressed genes show enrichment for autism genetic 
association signals. M12 showed highly significant enrichment for 
association signals (P =5 X 10 “), but neither M16 nor the list of 
differentially expressed genes showed such enrichment (Fig. 4). As a 
negative control, we performed the same set-enrichment analysis using 
two GWAS studies for non-psychiatric disease performed on the same 
genotyping platform: a genome-wide association for hair colour”, and 
a GWAS study of warfarin maintenance dose”, finding no significant 
enrichment of the association signal (Fig. 4b, Supplementary Fig. 4). 
These results indicate that (1) M12 consists of a set of genes that are 
supported by independent lines of evidence to be causally involved in 
ASD pathophysiology, and (2) the upregulation of immune response 
genes in the autistic brain observed by us and others” has no evidence 
of a common genetic component. 

Figure 3 | A2BP1-dependent differential splicing events. a, Top A2BP1- 
specific differential splicing events. Differential splicing events showing the most 
significant differences in alternative splicing between low-A2BP1 autism cases 
and controls as well as differential splicing differences consistent with the A2BP1 
binding site position. The horizontal axis depicts the percentage of transcripts 
including the alternative exon. Red, autism samples; black, control samples. 
b, Relevant gene ontology categories enriched in the set of genes containing 
exons differentially spliced between low-A2BP1 autism cases and controls. 

Our system-level analysis of the ASD brain transcriptome demon- 
strates the existence of convergent molecular abnormalities in ASD for 
the first time, providing a molecular neuropathological basis for the 
disease, whose genetic, epigenetic, or environmental aetiologies can 
now be directly explored. The genome-wide analysis performed here 
significantly extends previous findings implicating synaptic dysfunc- 
tion, as well as microglial and immune dysregulation in ASD® by pro- 
viding an unbiased systematic assessment of transcriptional alterations 
and their genetic basis. We show that the transcriptome changes 
observed in ASD brain converge with GWAS data in supporting the 
genetic basis of synaptic and neuronal signalling dysfunction in ASD, 
whereas immune changes have a less pronounced genetic component 
and thus are most likely either secondary phenomena or caused by 
environmental factors. Because immune molecules and cells such as 
microglia have a role in synaptic development and function”’, we specu- 
late that the observed immune upregulation may be related to abnormal 
ongoing plasticity in the ASD brain. The striking attenuation of gene 
expression differences observed here between frontal and temporal cortex 
in ASD is likely to represent a defect of developmental patterning and 
provides a strong rationale for further studies to assess the pervasiveness 
of transcriptional patterning abnormalities across the ASD brain. We 
also demonstrate for the first time alterations in differential splicing 
associated with A2BP1 levels in the ASD brain, and show that many 
of the affected exons belong to genes involved in synaptic function. 
Finally, given current evidence of genetic overlap between ASD and 
other neurodevelopmental disorders including schizophrenia and 
attention deficit hyperactivity disorder (ADHD), the data provide a new 
pathway-based framework from which to assess the enrichment of 
genetic association signals in other allied psychiatric disorders. 

Figure 4 | GWAS set enrichment analysis. a, GWAS set enrichment analysis 
using the discovery AGRE cohort from ref. 27. For each gene set (DEX, 
differentially expressed genes; M12 and M16) the null distribution of the 
enrichment score generated by 10,000 random permutations is shown (x-axis) 
and the enrichment score for the gene set is depicted by a red vertical line. A P 
value < 0.01 was considered significant to correct for multiple comparisons. 
b, GWAS signal enrichment of differentially expressed genes and the autism- 
associated co-expression modules M12 and M16. Enrichment P values are 
shown for an autism GWAS data set (ref. 27, AGRE discovery cohort) as well as 
two control data sets consisting of GWAS studies of non-psychiatric traits: ref. 
23 (Negative control 1) and ref. 24 (Negative control 2). The red line marks the 
P value threshold for significance. 


METHODS SUMMARY 
Brain tissue. Post-mortem brain tissue was obtained from the Autism Tissue 
Project and the Harvard Brain Bank as well as the MRC London Brain bank for 
Neurodegenerative Disease. Detailed information on the autism cases included in 
this study is available in Methods. 
Microarrays and RNA-seq. Total RNA was extracted from 100 mg of tissue using 
a Qiagen miRNA kit according to the manufacturer’s protocol. Expression profiles 
were obtained using Illumina Ref8 v3 microarrays. RNA-seq was performed on 
the Illumina GAIIx, as per the manufacturer’s instructions. Further detailed 
information on data analysis is available in Methods. 

Full detailed Methods accompany this paper as Supplementary Information. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS 


Brain tissue samples. Brain tissue samples from 19 autism cases and 17 controls 
were obtained from the Autism Tissue Project (ATP) and the Harvard Brain Bank. 
For each brain, tissue was obtained from frontal cortex (BA9), temporal cortex 
(BA41/42 or BA22) and cerebellum (vermis), with the exception of three controls 
lacking the cerebellum sample (Supplementary Table 1). For the replication 
experiment, frontal cortex tissue (BA44/45) from nine ASD cases and five controls 
was obtained from the ATP and MRC London Brain bank for Neurodegenerative 
Disease, respectively (Supplementary Table 4). 

For all of the autism cases, clinical information is available upon request from 
ATP (http://www.autismtissueprogram.org), including the ADI-R diagnostic 
scores. Supplementary Table 12 contains a summary of clinical characteristics. 
Although autism cases with known genetic causes were not included in this study, 
a chromosome 15q duplication was identified in one case (AN17138) by high- 
density single nucleotide polymorphism (SNP) arrays”* during the course of this 
study. The ATP cases were genotyped with high-density SNP arrays and with two 
exceptions all are Caucasians. The two Asian samples cluster with the other ASD 
cases in the current study, and are not distinguishable from the Caucasian cases 
based on clustering by gene expression. 

RNA extractions and microarrays. Total RNA was extracted from approximately 
100 mg of frozen tissue, using the Qiagen miRNA kit. RNA concentration was 
assessed by a NanoDrop spectrophotometer and RNA quality was measured using 
an Agilent Bioanalyzer. All RNA samples included in the expression analysis had an 
RNA integrity number (RIN) > 5. cDNA labelling and hybridizations on Illumina 
Ref8 v3 microarrays were performed according to the manufacturer’s protocol. 
Microarray data analysis. Microarray data analysis was performed using the R 
software and Bioconductor packages. Raw expression data were log2 transformed 
and normalized by quantile normalization. Data quality control criteria included 
high inter-array correlation (Pearson correlation coefficients > 0.85) and detec- 
tion of outlier arrays based on mean inter-array correlation and hierarchical 
clustering. Probes were considered robustly expressed if the detection P value 
was <0.05 for at least half of the samples in the data set. Cortex samples (58: 29 
autism, 29 controls) and cerebellum samples (21: 11 autism, 10 controls) fulfilled 
all data quality control criteria. The 29 autism cortex samples included tissue from 
13 ASD cases with both frontal and temporal cortex and 3 ASD cases with frontal 
cortex only (in total 16 frontal cortex and 13 temporal cortex ASD samples). The 
29 control samples also included tissue from 13 controls with both frontal 
and temporal cortex and 3 controls with frontal cortex only (in total 16 frontal 
cortex and 13 temporal cortex control samples). 
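
The preprocessing steps described above (log2 transformation, quantile normalization and the detection filter) can be sketched as follows. This is a plain-Python illustration under stated assumptions, not the Bioconductor code used for the analysis: function names are hypothetical, matrices are lists of probe rows with one column per array, and ties are broken by probe order rather than averaged.

```python
import math


def log2_transform(matrix):
    """Apply log2 to every raw intensity (all values must be positive)."""
    return [[math.log2(v) for v in row] for row in matrix]


def quantile_normalize(matrix):
    """Give every array (column) the same distribution of values."""
    n_probes, n_arrays = len(matrix), len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_arrays)]
    # Reference distribution: mean of the k-th smallest value across arrays.
    sorted_cols = [sorted(col) for col in columns]
    reference = [sum(col[k] for col in sorted_cols) / n_arrays for k in range(n_probes)]
    normalized = [[0.0] * n_arrays for _ in range(n_probes)]
    for j, col in enumerate(columns):
        order = sorted(range(n_probes), key=lambda i: col[i])
        for rank, i in enumerate(order):
            normalized[i][j] = reference[rank]
    return normalized


def robustly_expressed(detection_p, alpha=0.05, min_fraction=0.5):
    """Indices of probes with detection P < alpha in at least min_fraction of arrays."""
    keep = []
    for i, row in enumerate(detection_p):
        n_detected = sum(1 for p in row if p < alpha)
        if n_detected >= min_fraction * len(row):
            keep.append(i)
    return keep
```

After normalization each array has exactly the same set of values, so differences between arrays in overall intensity no longer contribute to between-group comparisons.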

Initially, all samples were normalized together to assess clustering by brain 
region. As expected, we observed distinct clustering of cortex and cerebellum 
samples (Supplementary Fig. 2A). For subsequent analyses, cortex samples and 
cerebellum samples were normalized and analysed separately. 
Differential expression. Differential expression was assessed using the SAM 
package (significance analysis of microarrays, http://www-stat.stanford.edu/ 
~tibs/SAM) and unless otherwise specified the significance threshold was 
FDR < 0.05 and fold changes > 1.3. Given that SAM is less sensitive in detecting 
differentially expressed genes for small numbers of samples, for the replication 
cohort, the differential expression was assessed by a linear regression method 
(Limma package, http://bioconductor.org/packages/release/bioc/html/limma.html). 
Our results showing a high degree of overlap between genes differentially 
expressed in the two data sets indicate that the expression differences observed are 
independent of the analysis methods. 

Because 444 genes were differentially expressed between autism and controls in 
cortex and only 2 genes were differentially expressed between the two groups in 
cerebellum (FDR < 0.05), we tested whether this difference was due to the smaller 
number of cerebellum samples, by relaxing the statistical criteria to FDR < 0.25. 
We found fewer than 10 differentially expressed genes in cerebellum using the 
relaxed statistical criteria, supporting the conclusion that genome-wide expression 
changes in autism were more pronounced in cerebral cortex than in cerebellum. 

To account for the fact that the control group of DS1 contained samples from a 
single female whereas the autism DS1 group included four females, we eliminated 
from differential expression analysis all probes showing evidence of gender- 
specific gene expression (n = 70). We also applied linear regression of expression 
values against age and sex, and then assessed differential expression between the 
autism and control groups using the residual values. We observed a 96% overlap 
between differentially expressed genes using either the residual values or the raw 
data, indicating that neither age nor sex were major drivers of expression differ- 
ences between the autism and control groups. 
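
The covariate adjustment described above can be sketched for a single gene as an ordinary least-squares fit followed by extraction of residuals. The Python below is a minimal illustration, not the code used for the analysis; the function names and the numeric coding of sex (for example 0 for male, 1 for female) are assumptions.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def residuals_after_covariates(expression, age, sex):
    """Residuals of one gene's expression after regression on age and sex."""
    design = [[1.0, a, s] for a, s in zip(age, sex)]
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(3)] for i in range(3)]
    xty = [sum(row[i] * y for row, y in zip(design, expression)) for i in range(3)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in design]
    return [y - f for y, f in zip(expression, fitted)]
```

The residuals would then replace the expression values in the autism versus control comparison, so that any remaining group difference cannot be attributed to a linear effect of age or sex.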

Differential expression between frontal and temporal cortex was assessed by a 
paired modified t-test (SAM) using the 13 autism and 13 control cases for which 
RNA samples from both cortex areas passed the quality control criteria. For each 
of the 510 genes that were differentially expressed in control samples between 
frontal and temporal cortex, we compared the variance of autism and control 
expression values in frontal cortex and temporal cortex. The homogeneity of 
variance (homoscedasticity) of gene expression was assessed using the Bartlett test 
in R. Fifty-one genes showed a significant difference in variance (P < 0.05, Bartlett 
test) between autism and control groups both in frontal and temporal cortex, and 
the Bartlett test P-values for these genes are listed in Supplementary Data. 
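
For reference, Bartlett's test for two groups can be written out directly. The Python sketch below is an illustration only (the analysis itself used the test as implemented in R), with hypothetical function names. With two groups the statistic is referred to a chi-square distribution with one degree of freedom, whose upper tail can be expressed with the complementary error function.

```python
import math


def sample_variance(values):
    """Unbiased sample variance."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def bartlett_two_groups(group_a, group_b):
    """Bartlett's test for equal variances in two groups.

    Returns (statistic, p_value); the p-value uses the chi-square
    distribution with 1 degree of freedom.
    """
    groups = [group_a, group_b]
    k = len(groups)
    sizes = [len(g) for g in groups]
    variances = [sample_variance(g) for g in groups]
    n_total = sum(sizes)
    pooled = sum((n - 1) * v for n, v in zip(sizes, variances)) / (n_total - k)
    numerator = (n_total - k) * math.log(pooled) - sum(
        (n - 1) * math.log(v) for n, v in zip(sizes, variances)
    )
    correction = 1.0 + (
        sum(1.0 / (n - 1) for n in sizes) - 1.0 / (n_total - k)
    ) / (3.0 * (k - 1))
    statistic = numerator / correction
    # Upper tail of chi-square(1): P(X > t) = erfc(sqrt(t / 2)).
    p_value = math.erfc(math.sqrt(max(statistic, 0.0) / 2.0))
    return statistic, p_value
```

Applied gene by gene to the autism and control expression values within one cortical region, a small P value would flag a gene whose variance differs between the groups.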
WGCNA. Unsigned co-expression networks were built using the WGCNA package
in R. Probes with evidence of robust expression (9,914; see above) were
included in the network. Network construction was performed using the
blockwiseModules function in the WGCNA package, which allows network
construction for the entire data set. For each set of genes a pair-wise correlation
matrix is computed, and an adjacency matrix is calculated by raising the
correlation matrix to a power. The power of 10 was chosen using the scale-free topology
criterion and was used for all three networks: the network built using autism
samples only, control samples only or all samples. An advantage of weighted
correlation networks is the fact that the results are highly robust with respect to the 
choice of the power parameter. For each pair of genes, a robust measure of network 
interconnectedness (topological overlap measure) was calculated based on the 
adjacency matrix. The topological overlap based dissimilarity was then used as 
input for average linkage hierarchical clustering. Finally, modules were defined as 
branches of the resulting clustering tree. To cut the branches, we used the hybrid
dynamic tree-cutting method because it leads to robustly defined modules. To obtain
moderately large and distinct modules, we set the minimum module size to 40 
genes and the minimum height for merging modules at 0.1. Each module was 
summarized by the first principal component of the scaled (standardized) module 
expression profiles. Thus, the module eigengene explains the maximum amount of 
variation of the module expression levels. For each module, we defined the module
membership measure (also known as module eigengene-based connectivity, kME)
as the correlation between gene expression values and the module eigengene.
Genes were assigned to a module if they had a high module membership to the
module (kME > 0.7). An advantage of this definition (and the kME measure) is
that it allows genes to be part of more than one module. Genes that did not fulfil
these criteria for any of the modules were assigned to the grey module. For the cell
type marker enrichment analysis we used the markers defined experimentally in 
refs 32 and 33 which were previously used to annotate human brain network
modules.
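
The adjacency and topological overlap steps described above can be illustrated with a minimal sketch. This is not the WGCNA R package and does not reproduce blockwiseModules; it is a plain-Python stand-in that assumes Pearson correlation, an unsigned network with power 10, and the standard topological overlap formula. The function names and the toy expression matrix are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def adjacency(expr, power=10):
    """Unsigned adjacency: |correlation| raised to a soft-threshold power."""
    g = len(expr)
    adj = [[0.0] * g for _ in range(g)]  # diagonal kept at 0
    for i in range(g):
        for j in range(i + 1, g):
            a = abs(pearson(expr[i], expr[j])) ** power
            adj[i][j] = adj[j][i] = a
    return adj


def topological_overlap(adj):
    """Topological overlap measure computed from an adjacency matrix."""
    g = len(adj)
    k = [sum(row) for row in adj]  # connectivity of each gene
    tom = [[1.0 if i == j else 0.0 for j in range(g)] for i in range(g)]
    for i in range(g):
        for j in range(i + 1, g):
            shared = sum(adj[i][u] * adj[u][j]
                         for u in range(g) if u != i and u != j)
            t = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
            tom[i][j] = tom[j][i] = t
    return tom


# Toy data: four genes (rows) measured in five samples (columns).
expr = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.5],
    [5.0, 3.0, 4.0, 1.0, 2.0],
    [1.0, 3.0, 2.0, 5.0, 4.0],
]
adj = adjacency(expr, power=10)
tom = topological_overlap(adj)
# 1 - TOM is the dissimilarity that would feed hierarchical clustering.
dissimilarity = [[1.0 - t for t in row] for row in tom]
```

The dissimilarity matrix is what would be passed to average linkage hierarchical clustering; tree cutting and eigengene calculation are left out of this sketch.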

Module visualization: the topological overlap measure was calculated for the top 
100 genes in each module ranked by kME. The resulting list of gene pairs was 
filtered so that both genes in a pair had the highest kME for the module plotted 
(that is, most module-specific interactions). The resulting top 150 gene pairs were 
plotted using VisANT.

Gene ontology analyses. Functional enrichment was assessed using the DAVID
database (http://david.abcc.ncifcrf.gov/). For differentially expressed genes and
co-expression modules, the background was set to the total list of genes expressed in
the brain in the cortex data set. For genes containing differentially spliced exons, 
the background was set to the total set of genes showing evidence of alternative 
splicing in our RNA-seq data. The statistical significance threshold level for all 
gene ontology enrichment analyses was P < 0.05 (Benjamini and Hochberg cor- 
rected for multiple comparisons). 
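
As an illustration of the multiple-comparison correction named above, the following minimal sketch computes Benjamini and Hochberg adjusted P values. It is a generic textbook implementation, not the code used by DAVID; the function name and the example P values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
# Terms with an adjusted value below 0.05 would be called enriched.
significant = [p < 0.05 for p in adjusted]
```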

Statistical analyses. All gene set overlap analyses were performed by assessing the 
cumulative hypergeometric probability using the phyper function in R. The popu- 
lation size was defined as the total number of probes expressed in both data sets. If 
the comparison involved different platforms, the comparison was done at gene 
level. 
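
The overlap test described above corresponds to the upper tail of a hypergeometric distribution. The sketch below is a standard-library Python equivalent of what a call such as phyper(overlap - 1, ..., lower.tail = FALSE) returns in R; the function and argument names are hypothetical, and the numbers in the usage lines are invented for illustration.

```python
from math import comb


def overlap_pvalue(overlap, size_a, size_b, population):
    """P(X >= overlap) when size_b items are drawn from a population
    that contains size_a items belonging to the first gene set."""
    total = comb(population, size_b)
    upper = min(size_a, size_b)
    tail = sum(
        comb(size_a, k) * comb(population - size_a, size_b - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy usage: two sets of 5 drawn from a population of 10.
p_all_shared = overlap_pvalue(5, 5, 5, 10)
p_any_overlap = overlap_pvalue(0, 5, 5, 10)
```

Here the population argument plays the role of the total number of probes (or genes) expressed in both data sets.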

Quantitative RT-PCR. One microgram of total RNA was treated with RNase- 
free DNase I (Invitrogen/Fermentas) and reverse-transcribed using Invitrogen 
Superscript II reverse-transcriptase and random hexanucleotide primers 
(Invitrogen). Real-time PCR was performed on an ABI7900 cycler in 10 µl volume
containing iTaq Sybrgreen (Biorad) and primers at a concentration of 0.5 µM
each. The results shown in Supplementary Fig. 2b represent at least two independ- 
ent cDNA synthesis experiments for each gene. GAPDH levels were used as an 
internal control. Statistical significance was assessed by a two-tailed t-test assum- 
ing unequal variance. 

Semi-quantitative RT-PCR. Total RNA (600 ng) pooled from autism cases
(n = 2-3) or controls (n = 2-3) was reverse-transcribed as described above. 
cDNA (50 ng) was subjected to 30 cycles of PCR amplification using the primers 
described in Supplementary Table 11. PCR products were separated on a 3% 
agarose gel stained with GelStar (Lonza). 

RNA sequencing and data analysis. 73-nucleotide reads were generated using an
Illumina GAII sequencer according to the manufacturer’s protocol. To generate 
sufficient read coverage for the quantitative analysis of alternative splicing events, 
reads for ASD and control brain samples were separately pooled and aligned to an 
existing database of EST and cDNA-derived alternative splicing junctions using
the BLAST-like alignment tool (BLAT) as described previously. Reads were
considered properly aligned to a splice junction if at least 71 of the 73 nucleotides
matched and at least 5 nucleotides mapped to each of the two exons forming the
splice junction. Alternative exon inclusion values ('%inc'), representing the
proportion of messenger RNA transcripts with the alternatively spliced exon included,
were calculated for each mRNA pool as the ratio of reads aligning to the C1-A or
A-C2 junctions against reads aligning to all three possible junctions as
previously described (C1-A, A-C2, C1-C2, see Supplementary Fig. 3). Calculated %inc
values were considered reliable if at least one of the included junctions as well as 
the skipped junctions were covered by at least 20 reads. %inc values were
compared across samples using Fisher’s exact test and the Benjamini-Hochberg
correction to identify differentially spliced exons associated with autism. Differential
splicing events were considered significant if they fulfilled both criteria of 
FDR<0.1 and %inc difference between autism and controls >15%. 
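
The %inc calculation and the filters stated above can be sketched as follows. This is a literal reading of the text (included-junction reads divided by reads on all three junctions), not the authors' pipeline, which may normalize junction counts differently; all function names and read counts are hypothetical.

```python
def percent_inclusion(c1a, ac2, c1c2):
    """%inc: reads on the two inclusion junctions (C1-A, A-C2) as a
    percentage of reads on all three junctions (C1-A, A-C2, C1-C2)."""
    total = c1a + ac2 + c1c2
    if total == 0:
        return None  # no junction coverage at all
    return 100.0 * (c1a + ac2) / total


def is_reliable(c1a, ac2, c1c2, min_reads=20):
    """At least one inclusion junction and the skipping junction
    must each be covered by min_reads reads."""
    return (c1a >= min_reads or ac2 >= min_reads) and c1c2 >= min_reads


def is_significant(inc_autism, inc_control, fdr, max_fdr=0.1, min_delta=15.0):
    """Both criteria from the text: FDR below 0.1 and a %inc difference
    between autism and control pools above 15 percentage points."""
    return fdr < max_fdr and abs(inc_autism - inc_control) > min_delta


# Toy read counts for one alternative exon in the two pools.
inc_autism = percent_inclusion(30, 30, 40)
inc_control = percent_inclusion(10, 10, 80)
call = is_reliable(30, 30, 40) and is_significant(inc_autism, inc_control, fdr=0.05)
```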

GWAS set enrichment analysis. GWAS enrichment analysis was performed as 
previously described in ref. 38 with the main modification that we generated the 
null distribution, using permutation of gene labels rather than permutation of case/ 
control labels, because the raw genotyping data was not available for all data sets. 
This approach has been proposed as an acceptable alternative to phenotype label
permutation and has been previously used for set enrichment analyses of GWAS
data. For all genes that met the robust expression criteria in our data set, we
mapped the SNPs present on the Illumina 550k platform located within the
transcript boundaries and an additional 20 kb on the 5′ end and 10 kb on the 3′ end.
Each gene was assigned a GWAS significance value consisting of the lowest P value 
of all SNPs mapped to it. A gene set enrichment score (ES) based on the 
Kolmogorov-Smirnov statistic was calculated as previously described using the
−log(P value). The null distribution was generated by 10,000 random
permutations of gene labels in the list of gene/P-value pairs and an enrichment score ESp
was calculated for each permutation. To correct for the gene set size, the enrichment
scores were scaled by subtracting the mean and dividing by the standard deviation 
of ESp. The resulting z-scores were used to calculate the significance p value. 
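
The gene-label permutation scheme described above can be sketched as follows. This is an illustration under stated assumptions, not the published method: it assumes a weighted Kolmogorov-Smirnov-like running sum, a base-10 logarithm, and a one-sided normal approximation for converting the z-score into a P value. The function names and toy data are hypothetical.

```python
import math
import random
import statistics


def enrichment_score(scores, gene_set):
    """Maximum of a weighted running sum over genes ranked by score."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    hit_total = sum(scores[g] for g in ranked if g in gene_set)
    n_miss = sum(1 for g in ranked if g not in gene_set)
    running, best = 0.0, 0.0
    for g in ranked:
        if g in gene_set:
            running += scores[g] / hit_total
        else:
            running -= 1.0 / n_miss
        best = max(best, running)
    return best


def permutation_z(pvalues, gene_set, n_perm=10000, seed=0):
    """Scale the observed ES by the mean and s.d. of permuted scores (ESp)."""
    rng = random.Random(seed)
    genes = list(pvalues)
    values = [-math.log10(pvalues[g]) for g in genes]
    observed = enrichment_score(dict(zip(genes, values)), gene_set)
    null = []
    for _ in range(n_perm):
        shuffled = values[:]
        rng.shuffle(shuffled)  # permute gene labels, not case/control labels
        null.append(enrichment_score(dict(zip(genes, shuffled)), gene_set))
    z = (observed - statistics.mean(null)) / statistics.stdev(null)
    p = 0.5 * math.erfc(z / math.sqrt(2.0))  # assumed normal approximation
    return observed, z, p


# Toy data: 50 genes with distinct P values; the set holds the 5 smallest.
toy_p = {f"g{i}": (i + 1) / 51.0 for i in range(50)}
toy_set = {f"g{i}" for i in range(5)}
es, z, p = permutation_z(toy_p, toy_set, n_perm=200, seed=1)
```

Because the null scores are computed for a set of the same size, subtracting their mean and dividing by their standard deviation puts gene sets of different sizes on a comparable scale.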

Detection of prokaryotic mRNA signifies microbial
viability and promotes immunity

Leif E. Sander, Michael J. Davis, Mark V. Boekschoten, Derk Amsen, Christopher C. Dascher, Bernard Ryffel,
Joel A. Swanson, Michael Müller & J. Magarian Blander


Live vaccines have long been known to trigger far more vigorous
immune responses than their killed counterparts. This has been
attributed to the ability of live microorganisms to replicate and
express specialized virulence factors that facilitate invasion and
infection of their hosts. However, protective immunization can
often be achieved with a single injection of live, but not dead,
attenuated microorganisms stripped of their virulence factors.
Pathogen-associated molecular patterns (PAMPs), which are
detected by the immune system, are present in both live and
killed vaccines, indicating that certain poorly characterized aspects 
of live microorganisms, not incorporated in dead vaccines, are 
particularly effective at inducing protective immunity. Here we 
show that the mammalian innate immune system can directly sense 
microbial viability through detection of a special class of viability- 
associated PAMPs (vita-PAMPs). We identify prokaryotic messenger 
RNA as a vita-PAMP present only in viable bacteria, the recognition 
of which elicits a unique innate response and a robust adaptive 
antibody response. Notably, the innate response evoked by viability 
and prokaryotic mRNA was thus far considered to be reserved for 
pathogenic bacteria, but we show that even non-pathogenic bacteria 
in sterile tissues can trigger similar responses, provided that they 
are alive. Thus, the immune system actively gauges the infectious 
risk by searching PAMPs for signatures of microbial life and thus 
infectivity. Detection of vita-PAMPs triggers a state of alert not 
warranted for dead bacteria. Vaccine formulations that incorporate 
vita-PAMPs could thus combine the superior protection of live 
vaccines with the safety of dead vaccines. 

We hypothesized that the innate immune system might sense the most
fundamental characteristic of microbial infectivity, microbial viability
itself, and activate a robust immune response regardless of the presence
of more specialized factors that regulate microbial virulence. To study
the sensing of bacterial viability without the compounding effects of
replication or virulence factors, we used thymidine auxotrophs of
non-pathogenic Escherichia coli K12, strain DH5α (hereafter called thyA⁻ E.
coli). Viable and heat-killed thyA⁻ E. coli similarly activated nuclear
factor-κB (NF-κB) and mitogen-activated protein kinase p38
(Supplementary Fig. 1) in murine bone-marrow-derived macrophages and
elicited production of similar amounts of interleukin-6 (IL-6) and
tumour necrosis factor-α (TNF-α) (Fig. 1a). In contrast, viable thyA⁻
E. coli induced higher levels of IFN-β than heat-killed thyA⁻ E. coli or
lipopolysaccharide (LPS) (Fig. 1b), and only viable thyA⁻ E. coli induced
IL-1β secretion (Fig. 1c and Supplementary Fig. 2). Pro-IL-1β
transcription was equally induced by both viable and heat-killed thyA⁻ E. coli
(Fig. 1c), indicating that viable bacteria specifically elicit cleavage of
pro-IL-1β. This process is catalysed by caspase-1 in Nod-like receptor (NLR)-
containing inflammasome complexes, the assembly of which can be
triggered by the activity of bacterial virulence factors. Notably,
avirulent viable but not heat-killed thyA⁻ E. coli induced inflammasome
activation and pro-caspase-1 cleavage (Fig. 1d). Finally, viable but not
heat-killed thyA⁻ E. coli induced caspase-1-dependent inflammatory cell
death, termed pyroptosis, resulting in the release of lactate
dehydrogenase (LDH) (Fig. 1e) and the appearance of 7-amino-actinomycin D
(7AAD)⁺annexin-V⁻/low cells (Fig. 1f). Similar responses were observed
in peritoneal macrophages and both splenic and bone-marrow-derived
dendritic cells (Supplementary Fig. 2b). Killing thyA⁻ E. coli by ultraviolet
irradiation, antibiotics, or ethanol also selectively abrogated IL-1β
secretion and pyroptosis without affecting IL-6 production (Fig. 1g and
Supplementary Fig. 3), indicating that a general determinant associated
with bacterial viability is detected.

To determine whether pathogenic bacteria can also activate the 
inflammasome in the absence of virulence factors, we studied attenuated 
strains of selected pathogens: Shigella flexneri virulence plasmid-cured
strain BS103, Salmonella enterica serovar Typhimurium SL1344
ΔSpi1ΔSpi2, lacking the Salmonella pathogenicity islands SPI-1 and
SPI-2 (ref. 10), and Listeria monocytogenes ΔHlyΔfliC, lacking
listeriolysin O and flagellin. These mutants induced IL-1β production at
levels comparable to those induced by thyA⁻ E. coli (Fig. 1h), but lower
and with slower kinetics than their pathogenic counterparts
(Supplementary Fig. 4). IL-1β production was abolished when these bacteria
were killed, whereas IL-6 production was similar (Fig. 1h). Thus,
immune cells detect universal characteristics of viability different from 
virulence factors. 

Caspase-1 activation, pyroptosis and IL-1β production in response
to thyA⁻ E. coli were abrogated in macrophages deficient for NLRP3 or
for the inflammasome adaptor apoptosis speck protein with caspase
recruitment (ASC or PYCARD) (Fig. 1i, j), whereas NLRC4 was
dispensable (Supplementary Fig. 5). Pyroptosis and IL-1β production
induced by viable thyA⁻ E. coli were abrogated in Casp1⁻/⁻
macrophages (Fig. 1j) and suppressed by inhibitors for caspase-1, but not
caspase-8 (Supplementary Fig. 6).

Induction of IFN-β mRNA and protein by viable thyA⁻ E. coli
required the Toll-like receptor (TLR) adaptor TRIF (Fig. 2a, b) and
downstream interferon regulatory factor-3 (IRF3) (Supplementary
Fig. 7), but not MyD88, the main TLR adaptor (Fig. 2a, b). In contrast,
transcription of pro-IL-1β was largely dependent on MyD88.
Consequently, Myd88⁻/⁻ cells secreted no IL-1β (Fig. 2c, d), whereas
pyroptosis and caspase-1 cleavage were intact (Fig. 2e, f). Notably,
although TRIF was dispensable for pro-IL-1β transcription (Fig. 2c),
Trif⁻/⁻ cells failed to secrete IL-1β (Fig. 2d), were protected from
pyroptosis (Fig. 2e) and did not activate caspase-1 (Fig. 2f). These findings
revealed an unexpected role for TRIF in NLRP3 inflammasome
activation in response to viable thyA⁻ E. coli. In contrast, pyroptosis induced
by pathogenic S. enterica Typhimurium proceeded independently of
TRIF (Supplementary Fig. 8). Differential involvement of TRIF, together
with differences in magnitude and kinetics of the response (Fig. 1h and
Supplementary Fig. 4), indicated that inflammasome activation in
response to virulence factors occurs in a manner distinct from that to
viability.

Genome-wide transcriptional analysis of wild-type and Trif⁻/⁻
macrophages before and after phagocytosis of viable thyA⁻ E. coli
showed differential regulation of several clusters of genes
(Supplementary Fig. 9) including IFN-regulated genes, as expected (Fig. 2a, b and
Supplementary Fig. 10a), whereas most of the Rel/NF-κB target genes
were comparable (Supplementary Fig. 10b). Nlrp3 expression was
induced independently of TRIF (Fig. 2g, h), and negative regulators
of inflammasome activity, such as those encoded by Mediterranean
fever (Mefv), Nlrp10 and Casp12 genes, were also unchanged or
expressed at higher levels in wild-type macrophages (Fig. 2g), possibly
due to negative feedback. Thus, the role of TRIF in inflammasome
activation upon phagocytosis of viable thyA⁻ E. coli is not explained
by transcriptional control of inflammasome components (so-called
priming). Furthermore, ATP and reactive oxygen species (ROS),
known activators of the NLRP3 inflammasome, were not involved, as
deficiency for P2X7R, which is required for ATP-mediated NLRP3
activation, did not affect pyroptosis or IL-1β production
(Supplementary Fig. 11a, b), and ROS accumulated equally in response to viable
and heat-killed thyA⁻ E. coli independently of TRIF (Supplementary
Fig. 11c).

Figure 1 | Sensing bacterial viability induces IFN-β and activates the NLRP3
inflammasome in the absence of virulence factors. a, b, IL-6 and TNF-α
(a) and IFN-β protein and mRNA (at 2 h) (b) levels in murine BMMs
stimulated with medium (ctrl), lipopolysaccharide (LPS), thyA⁻ E. coli (EC)
and heat-killed thyA⁻ E. coli (HKEC). Multiplicity of infection = 20. c, IL-1β
(top), Il1b mRNA (bottom, left y axis) and secreted IL-1β (bottom, right y axis)
at indicated times is shown. d, i, Caspase-1 immunoblots at 18 h in wild-type
(d) or wild-type (WT), Nlrp3⁻/⁻ and Asc⁻/⁻ BMMs (i). e, f, Pyroptosis by LDH
release (e) and FACS (f) at 18 h is shown. g, h, IL-6 and IL-1β in response to
thyA⁻ E. coli, viable or killed by different means (g, bone-marrow-derived
dendritic cells (BMDCs)) or viable or heat-killed thyA⁻ E. coli, attenuated
thyA⁻ Shigella (BS103), Salmonella (ΔSpi1/2) and Listeria (ΔHlyΔfliC), or
virulent Salmonella SL1344 (h). C2H5OHEC, ethanol-killed E. coli; GentEC,
gentamicin-killed E. coli; UVEC, UV-irradiated E. coli. j, LDH, IL-1β and IL-6
in BMMs of the indicated genotype in response to medium (ctrl), viable thyA⁻
E. coli (EC) and heat-killed thyA⁻ E. coli (HKEC). All responses are by murine
BMMs and measured at 24 h unless indicated otherwise. Hash symbol indicates
not detected. Data represent ≥5 experiments. All bars represent mean ± s.e.m.


Figure 2 | The TLR signalling adaptor TRIF controls ‘viability-induced’
responses. a-e, Ifnb transcription at 2 h (a), IFN-β secretion at 24 h (b), Il1b
transcription at 2 h (c), IL-1β secretion (d) and LDH release (e) at 24 h after
phagocytosis of viable (EC) or heat-killed (HKEC) thyA⁻ E. coli. f, Caspase-1
immunoblot at 18 h. Data in a-f are from murine BMMs and represent ≥5
experiments. g, Gene microarray analysis of wild-type and Trif⁻/⁻ BMMs
treated with viable thyA⁻ E. coli for 1, 3 or 6 h (three biological replicates,
numbered 1-3). A heat map of positive regulators/essential components (+)
and negative regulators (−) of inflammasomes is shown. h, Nlrp3 transcription
at 1 h in BMMs. i, j, Serum levels of IL-6 and IL-1β 6 h after injection of 1 × 10°
viable or 5 × 10° heat-killed thyA⁻ E. coli (i), and splenic bacterial burdens 72 h
after injection of 1 × 10° non-auxotroph E. coli (j) into wild-type, Trif⁻/⁻,
Asc⁻/⁻ and Nlrp3⁻/⁻ mice are shown. Each symbol represents one mouse. *,
P ≤ 0.05; **, P ≤ 0.01; ***, P ≤ 0.001. NS, not statistically significant. Hash
symbol indicates not detected. All bars represent mean ± s.e.m.

Injection of viable and heat-killed thyA⁻ E. coli into mice induced
similarly high serum levels of IL-6 (Fig. 2i). In contrast, circulating
IL-1β was detected only in mice infected with viable bacteria (Fig. 2i),
whereas IFN-β levels were undetectable in all groups (data not shown).
Confirming our results in vitro, production of IL-1β (but not IL-6)
in vivo also required TRIF, ASC and NLRP3 (Fig. 2i). Injection of
non-pathogenic S. enterica Typhimurium induced serum IL-1β levels
comparable to those elicited by thyA⁻ E. coli, which similarly
depended on TRIF (Supplementary Fig. 12). Although pathogenic
S. enterica Typhimurium elicited higher levels of serum IL-1β than
non-pathogenic Salmonella, this response was also severely reduced in
Trif⁻/⁻ mice, suggesting a previously unappreciated role for TRIF in
Salmonella infection (Supplementary Fig. 12). Importantly, deficiency
in TRIF, ASC and NLRP3 impaired bacterial clearance during
systemic infection with replication-sufficient non-pathogenic E. coli
(Fig. 2j). This failure was more dramatic in Trif⁻/⁻ than in Asc⁻/⁻
or Nlrp3⁻/⁻ mice, possibly due to the central upstream role of TRIF in
inflammasome activation and IFN-β production.

The ability to sense microbial viability through pathways down- 
stream of pattern recognition receptors indicates the existence of vita- 
PAMPs; that is, PAMPs associated with viable but not dead bacteria. In 
contrast to LPS and genomic DNA, which remained constant after
killing thyA⁻ E. coli with heat, total bacterial RNA was rapidly lost
(Fig. 3a, b and Supplementary Fig. 13). Total RNA content was also
lost with antibiotic treatment, and little ribosomal RNA (rRNA)
remained after killing with ultraviolet irradiation and ethanol
(Supplementary Fig. 14). Only fixation with paraformaldehyde (PFA)
efficiently killed the bacteria (not shown) while preserving total RNA
content (Supplementary Fig. 15a). Remarkably, unlike bacteria killed by
other means, PFA-killed bacteria induced pyroptosis and IL-1β
production to levels similar to those induced by viable bacteria
(Supplementary Fig. 15b). Thus, the presence or absence of RNA cor- 
related with the ability to activate pathways involved in sensing viability. 

These results indicate that prokaryotic RNA represents a labile
PAMP closely associated with bacterial viability that might signify
microbial life to the immune system. Indeed, addition of purified total
bacterial RNA fully restored the ability of heat-killed thyA⁻ E. coli to
induce pyroptosis, IL-1β and IFN-β production (Fig. 3c). These
responses were dependent on TRIF, NLRP3 and caspase-1, just as those
responses elicited by viable bacteria (Fig. 3d compared to Figs 1j and
2a-f). The NLRP3 inflammasome mediates recognition of viral RNA
during influenza A infection. Together with our results and those of
others, this suggests a more general role for NLRP3 in responses to
RNAs of microbial origin. RNA can activate the NLRP3
inflammasome when delivered into the cytosol (where NLRP3 is found) with
transfection reagents. In contrast, inflammasome activation by the
combination of total bacterial RNA and dead thyA⁻ E. coli did not
require RNA transfection (Fig. 3c, d). Administration of total E. coli
RNA alone or in combination with LPS (to mimic an E.-coli-derived
PAMP plus RNA) had little effect on NLRP3 inflammasome activation
unless the RNA was delivered to the cytosol using Lipofectamine
(Supplementary Fig. 16) or in combination with ATP, as reported
previously. Thus, phagocytosis of viable bacteria is a natural context
of bacterial-RNA-mediated NLRP3 inflammasome activation.

These findings raised the question as to how vita-PAMPs in
phagolysosomes gain access to cytosolic receptors such as NLRP3 in
the absence of invasion, auxiliary secretion systems or pore-forming
toxins. To address this question, we exploited the pH-sensitive
excitation spectrum of fluorescein: the acidic pH in phagolysosomes
quenches fluorescence whereas release into the pH-neutral cytosol
allows a regain in fluorescence. Phagocytosis of avirulent thyA⁻
E. coli in the presence of fluorescein-conjugated dextran (Fdx)
consistently induced low-level release of Fdx into the cytosol of
macrophages (Fig. 3e, f and Supplementary Fig. 17). This indicates that
phagosomes carrying E. coli exhibit intrinsic leakiness, a property
previously described for particles such as beads and crystals that
induce phagolysosomal destabilization. Interestingly, killed E. coli
also induced Fdx release, although to a slightly lower extent than viable
E. coli (Fig. 3e, f), demonstrating that phagosomal leakage occurs
independently of bacterial viability. Therefore, RNA from viable
bacteria could gain access to cytosolic receptors via intrinsic
phagosomal leakage. These results may also explain the reported ability
of phagosome-degraded mutants of Listeria monocytogenes or
Staphylococcus aureus to induce a transcriptional response dependent
on cytosolic NLRs.

Figure 3 | Bacterial RNA is a vita-PAMP that
accesses cytosolic receptors during phagocytosis
and in the absence of virulence factors. a, LPS/
endotoxin, genomic DNA and total RNA in thyA⁻
E. coli before and at indicated times after heat
killing. b, Agarose gel electrophoresis of thyA⁻ E.
coli total RNA before and after heat killing at 60 °C
for 60 min followed by 4 °C incubation for the
indicated times. c, d, LDH, IL-1β, IFN-β and IL-6
at 24 h in response to viable thyA⁻ E. coli (EC),
heat-killed thyA⁻ E. coli (HKEC), or heat-killed
thyA⁻ E. coli with 10 µg ml⁻¹ total RNA
(HKEC+RNA). Hash symbol in c and d indicates
not detected. Data in a-d are from murine BMMs
and represent ≥5 experiments. e, Representative
ratiometric epifluorescence imaging of murine
BMMs at 8 h with Fdx alone (ctrl 8 h), Fdx and
viable thyA⁻ E. coli (EC 8 h) or gentamicin-killed
thyA⁻ E. coli (GentEC). Colour code indicates pH
scale. Positive control is ground silica (silica 1 h).
f, Quantification of cytosolic Fdx expressed as
percentage of total Fdx per cell. Each dot represents
the percentage of released Fdx per individual cell.
Grey bars represent mean Fdx release. *, P ≤ 0.05;
**, P ≤ 0.01; ***, P < 0.001. All bars represent
mean ± s.e.m.

Digestion of total RNA from E. coli with exonuclease RNase I and
double-stranded RNA (dsRNA)-specific endonuclease RNase III
abrogated LDH and IL-1β release, whereas DNase treatment had no
effect (Fig. 4a). Of the E. coli RNA species, mRNA most potently
induced pyroptosis as well as production of IL-1β and IFN-β. Small
RNA (sRNA), or the most abundant RNA, ribosomal RNA (rRNA),
had little or no detectable effects (Fig. 4b and Supplementary Fig. 18).
Escherichia coli rRNA undergoes extensive modifications not found in
mRNA, which may underlie the differential activity of these RNA
species. The relative amount of mRNA was <1% of the total RNA and
accordingly, mRNA was approximately 100-fold more effective than
total RNA (Figs 3c and 4a, b and Supplementary Fig. 18).

In-vitro-transcribed single-stranded mRNA of the E. coli Gro
operon (Supplementary Fig. 19a, b), which is strongly expressed upon
phagocytosis of bacteria, induced caspase-1 cleavage and subsequent
pyroptosis and IL-1β production when phagocytosed together with
heat-killed thyA⁻ E. coli (Fig. 4c, e and Supplementary Fig. 19c-e). The
single-stranded Gro mRNA sequence had a predicted secondary struc- 
ture with regions of high probability for base pairing (Fig. 4d), con- 
sistent with susceptibility of the stimulatory activity to RNase III 
treatment (Fig. 4a). Indeed, fully dsGro mRNA (Supplementary Fig. 
19b) induced responses similar to single-stranded Gro mRNA of the 
appropriate length (Fig. 4e and Supplementary Fig. 19d). Other tran- 
scripts also induced such responses, showing that the immunostimu- 
latory property is independent of RNA sequence (Fig. 4f). 

Notably, eukaryotic RNA was unable to elicit the responses induced
by E. coli mRNA (Fig. 4b). Unlike eukaryotic mRNA, triphosphate
moieties at the 5′ end of bacterial mRNAs are not capped with
7-methyl-guanosine (m⁷G), and might betray the prokaryotic
origin of these transcripts. However, neither treatment with calf
intestinal phosphatase (CIP) nor capping affected the activity of Gro
mRNA during phagocytosis of heat-killed thyA⁻ E. coli (Fig. 4g). The
stimulatory activity of purified E. coli total RNA or mRNA was also
unaltered by CIP treatment (Supplementary Fig. 20a, b), arguing
against a role for the RNA helicase retinoic acid inducible gene-I
(RIG-I), which can induce interferon and IL-1β production but
requires 5′-triphosphates for activation (Supplementary Fig. 20c).
Moreover, TRIF and NLRP3 are dispensable for RIG-I function but
are required for the stimulatory activity of bacterial RNA (Figs 2a, b
and 3d). Interestingly, RNA can induce RIG-I-dependent IFN-β during
infection with an invasive intracellular bacterium, indicating that the
nature of microbial pathogenesis and the cellular context in which
bacterial RNA is recognized may determine the choice of innate sensors
engaged. In contrast to 5′-triphosphate removal, adding polyadenylyl
groups to the 3′ end of Gro mRNA or purified E. coli mRNA abrogated
IL-1β secretion and pyroptosis (Fig. 4g and Supplementary Fig. 21).
Thus, absence of 3′-polyadenylation may allow specific detection of
prokaryotic mRNA during infection. Additional features may
distinguish self from microbial RNAs such as internal naturally occurring
nucleoside modifications in eukaryotic RNA.

To test the impact of vita-PAMPs on adaptive immunity, we
immunized mice with either viable or dead thyA⁻ E. coli, or a combination of
dead thyA⁻ E. coli and purified total bacterial RNA (Supplementary
Fig. 22). Whereas all three vaccines induced similar polyclonal anti-E.
coli IgM responses, production of class-switched IgG subclasses was
strongly enhanced in response to vaccination with viable compared to
killed E. coli (Fig. 4h). Adding total bacterial RNA to killed thyA⁻ E.
coli elevated IgG1, IgG2c, IgG2b and IgG3 antibody titres to or above
the levels in mice immunized with viable thyA⁻ E. coli. Thus, innate
detection of bacterial viability leads to robust activation of a humoral
adaptive response. These findings indicate that bacterial RNA can
augment killed vaccines to perform as well as live ones.

Our findings reveal an inherent ability of the immune system to
distinguish viable from dead microorganisms. The presence of live
bacteria in sterile tissues, regardless of whether these (still) express
virulence factors, poses an acute threat that must be dealt with by an
aggressive immune response. Dead bacteria, on the other hand, would
signify a successful immune response that can now subside. Detection
of vita-PAMPs within sterile tissues signifies microbial viability. Other
vita-PAMPs may exist in the form of second messengers like cyclic
di-adenosine or di-guanosine monophosphates or quorum-sensing
molecules. The extent to which vita-PAMPs contribute to the host
response during natural infection with pathogenic bacteria, relative to
other stimuli such as the activity of virulence factors, is an important
issue that requires further investigation. Given that bacteria tightly
regulate their virulence via multiple mechanisms in response to
different environmental signals and inside a host organism during
infection, detection of invariant vita-PAMPs essential to bacterial
survival may be a non-redundant fail-safe strategy for host protection.

Figure 4 | Bacterial mRNA constitutes an active vita-PAMP. a-c, e-g, LDH,
IL-1β and IL-6 at 24 h. a, Total thyA⁻ E. coli RNA treated with RNase I and
RNase III, RNase III alone, or DNase before stimulation of BMDCs. b, BMMs
treated with viable or heat-killed thyA⁻ E. coli, or heat-killed thyA⁻ E. coli with
0.1 µg ml⁻¹ of different bacterial RNA (ribosomal RNA (rRNA), mRNA, small
RNA (sRNA) or eukaryotic RNA (eukRNA)). c, BMDC responses. Gro RNA
indicates in-vitro-transcribed E. coli Gro operon RNA. d, Predicted secondary
structure of Gro RNA. The colour code indicates base pairing probability.
e, BMMs treated with in-vitro-transcribed Gro RNA or Gro dsRNA alone or
with heat-killed thyA⁻ E. coli. f, BMDC responses. Era RNA and DNApol RNA
indicate in-vitro-transcribed E. coli Era GTPase and DNA polymerase III RNA,
respectively. g, BMMs treated with different doses of unmodified (ctrl, control)
or modified Gro RNA with heat-killed thyA⁻ E. coli (5′ cap, 5′ m⁷G capping;
CIP, calf intestinal phosphatase; 3′ poly(A), 3′-polyadenylation). For a-g, the
hash symbol indicates not detected; all RNA at 10 µg ml⁻¹ except as noted; data
represent ≥5 experiments. h, Mice vaccinated and boosted twice with viable
thyA⁻ E. coli (EC), heat-killed thyA⁻ E. coli (HKEC) or heat-killed thyA⁻ E. coli
with 30 µg total purified bacterial RNA (HKEC+RNA) (vaccination regimen is
given in Supplementary Fig. 22). Class-specific anti-E. coli antibody serum
titres at 25 days are shown. *, P ≤ 0.05; **, P ≤ 0.01; ***, P ≤ 0.001. All bars
represent mean ± s.e.m.


METHODS SUMMARY 


Cells were infected with E. coli DH5a thyA at a multiplicity of infection of 20 for 
24h unless stated otherwise. Supernatants were assayed for cytokines by ELISA. 
Genome-wide transcriptional analysis of murine bone-marrow-derived macro- 
phages (BMMs) at 0, 1, 3 and 6h after infection was carried out on Affymetrix 
GeneChip Mouse Gene 1.1 ST 24-array plates. Phagosomal leakage in BMMs was 
detected by measuring Fdx release using a modified method previously 
described'®. In brief, BMMs were treated with thyA⁻ E. coli in the presence of
0.167 mg ml⁻¹ Fdx and imaged with excitation at 440 nm (pH insensitive) and
485 nm (pH sensitive). Fluorescence intensity ratios at 485 nm/440 nm were con- 
verted into pH maps and the percentage of Fdx release calculated (total intensity of 
pixels containing released Fdx/total Fdx intensity). Bacterial RNA was extracted 
from E. coli using the e.z.n.a RNA kit (Omega) and in vitro transcription of 
bacterial genes carried out using the MEGAscript kit (Ambion) followed by 
DNase digestion and RNA purification using the MEGAclear kit (Ambion). 
RNA polyadenylation was performed with the poly(A)-tailing kit (Ambion). 
Vaccinations were performed as a prime-boost regimen (see Methods). C57BL/6J
and P2rx7−/− mice were purchased from the Jackson Laboratory. Myd88−/−
and Trif−/− mice were provided by S. Akira, Trif−/− × Myd88−/− by R. Medzhitov,
Nlrp3−/−, Asc−/− and Nlrc4−/− by Millenium, and Casp1−/− by R. Flavell. Animal
care and experimentation were performed in accordance with approved MSSM 
Institutional Animal Care and Use Committee protocols. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
LETTER


METHODS 


Cells. Bone-marrow-derived dendritic cell (BMDC) cultures were grown as previously
described (ref. 31) in RPMI 1640 supplemented with granulocyte-macrophage
colony-stimulating factor (GM-CSF) and 5% fetal bovine serum (FBS), plus 100 µg
ml⁻¹ penicillin, 100 µg ml⁻¹ streptomycin, 2 mM L-glutamine, 10 mM HEPES,
1 nM sodium pyruvate, 1% MEM non-essential amino acids, and 2.5 µM
β-mercaptoethanol (all Sigma). Semi-adherent cells were harvested on ice on day
5 and re-plated immediately in fresh RPMI 1640 medium containing 10% FBS at 
5 X 10° cells per well in 24-well tissue-culture-treated plates. Stimuli were added 
immediately after re-plating in the same medium and the cells were centrifuged for 
2 min at 2,000 r.p.m. Murine macrophages were derived from the bone marrow 
(BMMs) of C57BL/6J, Myd88−/−, Trif−/−, Trif−/− × Myd88−/−, Nlrp3−/−,
Asc−/− or Casp1−/− mice, as described previously (ref. 32), in RPMI 1640 supplemented
with M-CSF and 10% FBS, plus 100 µg ml⁻¹ penicillin, 100 µg ml⁻¹ streptomycin,
10 mM HEPES and 1 nM sodium pyruvate (all Sigma). For some experiments
macrophages were derived from the bone marrow of Irf3−/− or P2rx7−/− mice.
Peritoneal macrophages were harvested 72 h after intraperitoneal injection of 1 ml
thioglycollate (BD Bioscience), grown overnight in RPMI 1640 medium supplemented
with 10% FBS and 100 µg ml⁻¹ penicillin and 100 µg ml⁻¹ streptomycin,
hereafter referred to as 'complete medium'. Mouse embryonic fibroblasts (MEFs)
deficient for RIG-I (RIG-I−/−) were provided by A. Ting with permission from S.
Akira, and grown in DMEM medium containing 10% FBS and 100 µg ml⁻¹ penicillin,
100 µg ml⁻¹ streptomycin.

Mice. C57BL/6J and P2rx7−/− mice were purchased from Jackson Laboratories.
Myd88−/− and Trif−/− mice were originally provided by S. Akira; Myd88−/− and
Trif−/− mice were interbred to homozygosity to generate Trif−/− × Myd88−/−
mice, and were provided by R. Medzhitov. Nlrp3−/−, Asc−/− or Casp1−/− bone
marrow was provided by B. Ryffel and mice for in vivo studies were acquired from
R. Flavell (through Millenium) and have been described previously (refs 33, 34). Irf3−/−
mice were provided by C. B. Lopez and were previously described (ref. 35). We used
8-10-week-old animals for all experiments. All experiments were approved by the
institutional ethics committee and carried out in agreement with the ‘Guide for 
the Care and Use of Laboratory Animals’ (NIH publication 86-23, revised 1985). 
Bacteria. Escherichia coli K12, strain DH5α were purchased from Invitrogen.
Naturally occurring thymidine auxotrophs (thyA⁻) were selected on Luria-Bertani
(LB) agar plates containing 50 µg ml⁻¹ trimethoprim and 500 µg ml⁻¹
thymidine (both Sigma). Auxotrophy was confirmed by inoculation and overnight
culture of single colonies in LB medium. thyA⁻ E. coli grew only in the presence of
thymidine and were resistant to trimethoprim. For phagocytosis experiments,
thyA⁻ E. coli were grown to mid-log phase, washed three times in phosphate
buffered saline (PBS) to remove thymidine and LB salts before addition to cells.
For heat killing, thyA⁻ E. coli were grown to log phase, washed and re-suspended
in PBS at an optical density at 600 nm (OD600) of 0.6, and subsequently incubated
at 60 °C for 60 min. thyA⁻ heat-killed E. coli were stored up to 18 h at 4 °C or used
immediately after cooling. Efficient killing was confirmed by overnight plating on
thymidine/trimethoprim-supplemented LB-agar plates. For gentamicin killing,
thyA⁻ E. coli were grown to mid-log phase, washed and re-suspended in LB
medium containing thymidine, trimethoprim and 50 µg ml⁻¹ gentamicin sulphate
and incubated in a shaking incubator at 37 °C overnight. Ethanol killing was
carried out by re-suspending log-phase thyA⁻ E. coli in 70% ethanol for 10 min,
followed by extensive washing in PBS. For ultraviolet killing, log-phase thyA⁻ E.
coli were re-suspended in PBS at an OD600 of 0.6, ultraviolet-irradiated with
1,000 mJ cm⁻² in a Petri dish followed by washing with PBS. Paraformaldehyde
(PFA) fixation was performed by re-suspending log-phase thyA⁻ E. coli in 4% PFA
in PBS for 10 min followed by extensive washing and re-suspension in PBS.
Shigella flexneri virulence plasmid-cured strain BS103 was provided by M. B.
Goldberg (ref. 36). thyA⁻ S. flexneri were selected similarly to thyA⁻ E. coli. D. M.
Monack provided Salmonella enterica serovar Typhimurium, strain SL1344
ΔSpi1ΔSpi2, lacking the Salmonella pathogenicity island SPI-1 and SPI-2 type-III
secretion systems (ref. 37). SL1344 ΔSpi1ΔSpi2 was grown in LB medium containing
25 µg ml⁻¹ kanamycin and 12 µg ml⁻¹ tetracycline. Listeria monocytogenes
ΔHlyΔfliC lacking listeriolysin O (LLO) and flagellin expression were provided
by D. Portnoy (ref. 38).

Treatment of macrophages and dendritic cells with viable and killed bacteria. 
Macrophages were detached and re-plated 4h before the experiment. BMDCs 
were re-plated immediately before addition of bacteria or soluble ligands. 
Unless stated otherwise, bacteria were used at a multiplicity of infection of 20. 
All experiments were carried out in antibiotic-free ‘complete medium’. One hour 
after addition of bacteria, penicillin (100 µg ml⁻¹) and streptomycin (100 µg ml⁻¹)
were added to the medium to kill any remaining extracellular bacteria.
Alternatively, gentamicin sulphate (50 µg ml⁻¹) was used. We also compared this
approach to washing the cells and replacing the antibiotic-free medium with
penicillin/streptomycin-containing medium after 1 h and found no differences
with regards to the cellular responses measured. Supernatants were collected
24 h after the addition of the bacteria unless stated otherwise in the figure legends.
Cytokine enzyme-linked immunosorbent assays. Supernatants from cultured
BMMs or BMDCs were collected at 24 h after stimulation or at the times indicated.
Enzyme-linked immunosorbent assay (ELISA) antibody pairs used for IL-6, IL-1β
and TNF-α were as listed below. All ELISA antibodies were used at 2 µg ml⁻¹
capture and 0.5 µg ml⁻¹ detection, with the exception of IL-6 capture, which
was used at 1 µg ml⁻¹. Detection antibodies were biotinylated and labelled by
streptavidin-conjugated horseradish peroxidase (HRP), and visualized by the
addition of o-phenylenediamine dihydrochloride (Sigma) (from tablets) or
3,3′,5,5′-tetramethylbenzidine solution (TMB, KPL). Colour development was stopped
with 3 M H₂SO₄ or TMB-Stop Solution (KPL), respectively. Recombinant cytokines
served as standards and were purchased from Peprotech. Absorbances at 492 or
450 nm were measured, respectively, on a tunable microplate reader (VersaMax,
Molecular Devices). Cytokine supernatant concentrations were calculated by
extrapolating absorbance values from standard curves where known concentrations
were plotted against absorbance using SoftMax Pro 5 software. Capture/detection
antibody pairs were as follows. IL-6, MP5-20F3/MP5-32C11 (BD Pharmingen);
IL-1β, B12/rabbit polyclonal antibody (eBioscience); TNF-α, TN3-19/rabbit
polyclonal antibody (eBioscience). IFN-β production was measured from supernatants
using the VeriKine Mouse IFN-Beta ELISA Kit (PBL Interferon source) following
manufacturer's instructions.
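
The conversion of absorbance readings to cytokine concentrations can be sketched as follows. This is only an illustration: it assumes piecewise-linear interpolation between standards, whereas the SoftMax Pro software used here may fit a different curve model (for example a four-parameter logistic). The function name and the standard values in the usage note are hypothetical.

```python
from bisect import bisect_left


def concentration_from_absorbance(absorbance, standards):
    """Interpolate a concentration from a standard curve.

    standards: list of (concentration, absorbance) pairs measured from
    recombinant cytokine dilutions. Linear interpolation between the two
    neighbouring standards is an assumption made for this sketch.
    """
    # Sort the standards by absorbance so neighbours can be located.
    pts = sorted(standards, key=lambda p: p[1])
    abs_vals = [a for _, a in pts]
    if not abs_vals[0] <= absorbance <= abs_vals[-1]:
        raise ValueError("absorbance outside the range of the standards")
    i = bisect_left(abs_vals, absorbance)
    if abs_vals[i] == absorbance:
        return pts[i][0]
    (c0, a0), (c1, a1) = pts[i - 1], pts[i]
    # Straight line between the two standards that bracket the reading.
    return c0 + (c1 - c0) * (absorbance - a0) / (a1 - a0)
```

For example, with invented standards of 0, 100, 500 and 1,000 pg ml⁻¹ read at absorbances of 0.05, 0.25, 1.05 and 2.05, a sample reading of 0.65 falls halfway between the second and third standards and is assigned about 300 pg ml⁻¹.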

Anti-E. coli antibody ELISA. 96-well microtitre plates were coated overnight 
with E. coli lysates (3 µg ml⁻¹) that we generated from log-phase cultures of
thyA⁻ E. coli. Serum samples from immunized mice were serially diluted (12
dilutions) and incubated in the pre-coated plates for 12h at 4°C followed by 
washing and incubation with rabbit anti-mouse isotype-specific Ig-HRP 
(Southern Biotech) for 1h. Bound rabbit anti-mouse Ig-HRP was visualized by 
the addition of o-phenylenediamine dihydrochloride (Sigma) from tablets, and the 
anti-E. coli antibody titres for each mouse were determined by absorbance read- 
ings at 490 nm. 

Measurement of inflammatory cell death. Cell death of macrophages or BMDCs 
was measured using the Cytotox96 cytotoxicity assay (Promega) following manu- 
facturer’s instructions. The assay measures the release of lactate dehydrogenase 
(LDH) into the supernatant calculated as the percentage of total LDH content, 
measured from cellular lysates (100%). LDH released by unstimulated cells was 
used for background correction. 
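
The LDH calculation described above can be sketched as follows. The exact formula is an assumption: here the signal from unstimulated cells is subtracted from both the sample and the total-lysate reading before the percentage is taken, and the function name is hypothetical. The kit manual should be consulted for the manufacturer's definition.

```python
def percent_ldh_release(sample, unstimulated, total_lysate):
    """Background-corrected LDH release as a percentage of total LDH.

    sample:        absorbance of supernatant from stimulated cells
    unstimulated:  absorbance of supernatant from unstimulated cells
                   (used as background)
    total_lysate:  absorbance of fully lysed cells (defines 100%)
    """
    denominator = total_lysate - unstimulated
    if denominator <= 0:
        raise ValueError("total lysate signal must exceed background")
    return 100.0 * (sample - unstimulated) / denominator
```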

Flow cytometric assessment of cell death. Cells were stimulated overnight, 
stained for Annexin V/7AAD using the Annexin V-PE/7AAD Apoptosis 
Detection kit (BD Pharmingen), and analysed by flow cytometry (FACSCalibur, 
BD). 

Flow cytometric measurement of ROS production. BMMs were loaded with the 
ROS indicator dye H2DCFDA (Molecular Probes/Invitrogen, 10 mM in PBS) for 
30 min followed by a recovery time of 30min in fresh pre-warmed ‘complete 
medium’. BMMs were then stimulated with viable or heat killed E. coli for 
60 min, washed and analysed by flow cytometry (FACSCalibur, BD). 

Western blots. For detection of caspase-1, protein extracts were separated on 
4-12% SDS-gradient gels (Invitrogen). For detection of all other proteins, samples 
were run on 10% SDS-polyacrylamide gels. Proteins were transferred to PVDF 
membranes (Millipore). Membranes were blocked with 5% milk in PBS and probed 
with the following antibodies: caspase-1 p10 (M-20)/rabbit polyclonal antibody, 
IκBα (C-21)/rabbit polyclonal antibody (both from Santa Cruz Biotechnologies),
phospho-IRF3 (Ser 396)/rabbit polyclonal antibody, IRF3/rabbit polyclonal
antibody, phospho-p38 MAPK (Thr 180/Tyr 182)/rabbit polyclonal antibody,
p38 MAPK/rabbit polyclonal antibody (all from Cell Signalling Technology),
α-tubulin (DM1A)/rabbit monoclonal antibody (Novus Biologicals).
Real-time PCR. Total RNA was isolated from macrophages using the RNeasy kit 
(Qiagen). Contaminating genomic DNA was removed by DNase digestion 
(DNase I, Promega). Reverse transcription was performed using Superscript III 
(Invitrogen) and cDNA was used for subsequent real-time PCR reactions. 
Quantitative real-time RT-PCR was conducted on an ABI Prism 7900 instrument 
using the Maxima SYBR green qPCR Master Mix (Fermentas) with the following 
primer pairs. β-Actin, FW 5′-GAAGTCCCTCACCCTCCCAA-3′, RV
5′-GGCATGGACGCGACCA-3′; Il1b, FW 5′-AAAGACGGCACACCCACCCTGC-3′,
RV 5′-TGTCCTGACCACTGTTGTTTCCCAG-3′; Ifnb, FW
5′-GCACTGGGTGGAAT-3′, RV 5′-TTCTGAGGCATCAA-3′; Nlrp3, FW
5′-CGAGACCTCTGGGAAAAAGCT-3′, RV 5′-GCATACCATAGAGGAATGTGATGTACA-3′. All
reactions were performed in duplicate and the samples were normalized to
β-actin. 'Fold inductions' were calculated using the ΔΔCt method relative to
unstimulated BMMs.
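
As an illustration of the ΔΔCt calculation, the sketch below averages duplicate Ct values, normalizes the target gene to β-actin and expresses the result relative to unstimulated cells. It assumes 100% amplification efficiency (a doubling per cycle); the function name and the Ct values in the usage note are hypothetical.

```python
from statistics import mean


def fold_induction(target_stim, actin_stim, target_unstim, actin_unstim):
    """Fold induction by the delta-delta-Ct method.

    Each argument is a list of replicate Ct values (duplicates here).
    Assumes perfect doubling per PCR cycle, which is a simplification.
    """
    # Delta Ct: target normalized to the beta-actin reference.
    d_ct_stim = mean(target_stim) - mean(actin_stim)
    d_ct_unstim = mean(target_unstim) - mean(actin_unstim)
    # Delta-delta Ct: stimulated relative to unstimulated cells.
    dd_ct = d_ct_stim - d_ct_unstim
    return 2.0 ** (-dd_ct)
```

With invented duplicate Ct values of 22 for the target and 18 for β-actin in stimulated cells, and 27 and 18 in unstimulated cells, ΔΔCt is −5 and the fold induction is 2⁵ = 32.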

Transcriptome analysis. BMMs derived from wild-type or Trif−/− mice were
stimulated with viable E. coli for 0, 1, 3 or 6 h and total RNA was extracted using the
RNeasy kit (Qiagen). RNA from three independent experiments was used for
transcriptional analysis. RNA integrity was checked on an Agilent 2100
Bioanalyser (Agilent Technologies) with 6000 Nano Chips. RNA was judged as
suitable only if samples showed intact bands of 18S and 28S ribosomal RNA
subunits, displayed no chromosomal peaks or RNA degradation products, and
had an RNA integrity number (RIN) above 8.0.

One-hundred nanograms of RNA were used for whole-transcript cDNA syn- 
thesis with the Ambion WT expression kit (Applied Biosystems). Hybridization, 
washing and scanning of an Affymetrix GeneChip Mouse Gene 1.1 ST 24-array 
plate was carried out according to standard Affymetrix protocols on a GeneTitan 
instrument (Affymetrix). 

Packages from the Bioconductor project, integrated in an in-house developed
management and analysis database for microarray experiments, were used for
analysis of the scanned arrays (ref. 39). Arrays were normalized using the Robust
Multi-array Average method (refs 40, 41). Probe sets were defined according to ref. 42.
With this method probes are assigned to unique gene identifiers, in this case
Entrez IDs. The probes on the Gene 1.1 ST arrays represent 19,807 genes that have
at least 10 probes per identifier. For the analysis, only genes that had an intensity
value of >20 on at least two arrays were taken into account. In addition, the
interquartile range of log2 intensities had to be at least 0.25. These criteria were met by
9,921 genes. Changes in gene expression are represented as signal log ratios between
treatment and control. Multiple Experiment Viewer software (MeV 4.6.1) was used
to create heatmaps (refs 43, 44). Genes were clustered by average linkage hierarchical
clustering using Pearson correlation. Significantly regulated genes were identified by
intensity-based moderated t-statistics (ref. 45). Obtained P-values were corrected for
multiple testing by a false discovery rate method (ref. 46).
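
The two gene-level filters described above (intensity >20 on at least two arrays, and an interquartile range of log2 intensities of at least 0.25) can be sketched as follows. This is not the Bioconductor code used for the analysis; the function name is hypothetical and the quartile convention (Python's "inclusive" method) is an assumption, since the text does not state which one was used.

```python
import math
from statistics import quantiles


def passes_expression_filter(intensities, min_intensity=20.0,
                             min_arrays=2, min_log2_iqr=0.25):
    """Return True if a gene passes both filters described in the text.

    intensities: normalized (non-log, positive) intensity of one gene
    across all arrays.
    """
    # Filter 1: intensity above the threshold on enough arrays.
    n_expressed = sum(1 for x in intensities if x > min_intensity)
    if n_expressed < min_arrays:
        return False
    # Filter 2: enough variability across arrays on the log2 scale.
    log_values = [math.log2(x) for x in intensities]
    q1, _, q3 = quantiles(log_values, n=4, method="inclusive")
    return (q3 - q1) >= min_log2_iqr
```

A gene with invented intensities of 16, 16, 64 and 64 passes (two arrays above 20, log2 interquartile range of 2), whereas a gene that is constant across arrays fails the variability filter.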

IFN-regulated genes were identified using the Interferome database
(http://www.interferome.org) (ref. 47) and grouped in a heat map. Rel/NF-κB target genes were
identified using another online database (http://bioinfo.lifl.fr/NF-KB/) which
compiles Rel/NF-κB target genes identified by various groups (ref. 48)
(http://people.bu.edu/gilmore/nf-kb/index.html). Inflammasome-related genes were compiled
based on the current literature (ref. 49).

Measuring release from bacterial phagosomes. Measurement of fluorescein- 
dextran (Fdx) release from macrophage phagosomes was performed using a modi- 
fied method described previously'®. BMMs were plated onto Mat-tek coverslip 
dishes (MatTek Corp.) and incubated overnight. BMMs were stimulated with viable 
or gentamicin-killed red fluorescent protein (RFP)-expressing thyA⁻ E. coli in the
presence of 0.167 mg ml⁻¹ Fdx in 200 µl of medium. After 120 min of co-culture,
additional Fdx and gentamicin containing medium was added to the coverslip 
dishes to prevent drying and to prevent bacterial overgrowth. Cells were imaged 
after 2, 4 and 8 h to measure release of Fdx. Microscopic imaging was performed on 
an IX70 inverted microscope (Olympus) equipped with an X-cite 120 metal halide 
light source (EXFO) and excitation and emission filter wheels. Phase contrast and 
two fluorescent images were acquired for each field of cells. The fluorescent images 
used the same emission settings, but used different excitation band-pass filters. Fdx 
fluorescence intensity using an excitation filter centred at 440 nm is relatively 
insensitive to pH, whereas fluorescence intensity using an excitation filter centred 
at 485 nm is very sensitive to pH. The ratio of fluorescence intensity at 485 nm 
divided by 440 nm was converted into pH maps using calibration curves generated
by imaging BMMs with Fdx-containing compartments at a series of fixed pH
conditions. As described previously"®, pixels with pH above 5.5 were designated as 
representing Fdx which has been released from endolysosomal compartments. The 
percentage of Fdx release was calculated by dividing the total intensity of pixels 
containing released Fdx by the total Fdx intensity for each cell. 
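
The per-cell release calculation can be sketched as follows, assuming each pixel has already been assigned a pH from the 485 nm/440 nm ratio and the calibration curve. The function name and the data layout (a list of pH and intensity pairs) are hypothetical, and which channel supplies the intensity is not specified here.

```python
def percent_fdx_release(pixels, ph_threshold=5.5):
    """Percentage of Fdx signal in pixels designated as released.

    pixels: (pH, intensity) pairs for all Fdx-containing pixels of one
    cell. Pixels with pH above the threshold are counted as Fdx released
    from endolysosomal compartments.
    """
    pixels = list(pixels)
    total = sum(intensity for _, intensity in pixels)
    if total <= 0:
        raise ValueError("cell contains no Fdx signal")
    released = sum(intensity for ph, intensity in pixels
                   if ph > ph_threshold)
    return 100.0 * released / total
```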

Infections and vaccinations. For measurements of systemic cytokine levels, 
C57BL/6J wild-type, Trif−/−, Asc−/− or Nlrp3−/− mice were injected with
1 X 10° viable or 5 X 10° heat-killed thyA” E. coli, respectively. Blood samples were 
drawn 6h after infection, and cytokine concentrations were measured by ELISA. 
For determination of bacterial clearance, we infected mice with 1 X 10° viable 
replication-sufficient E. coli by intraperitoneal injection. Mice were monitored daily 
and moribund animals were killed according to humane criteria established and 
approved by our institutional IACUC committee. After 60h, animals were killed 
and the spleens were explanted, homogenized, serially diluted and plated on LB- 
agar plates overnight followed by colony forming units (c.f.u.) counting.

For vaccinations, we followed a prime-boost regimen as shown in the schematic 
in Fig. 4h that was adopted from a previous study”. In brief, mice received an 
initial vaccination intraperitoneally with 5 X 10’ c.fu. of viable or heat-killed 
thyA~ E. coli or a combination of 5 X 10’ c.f.u. heat-killed thyA~ E. coli and 
30 µg of purified E. coli total RNA, followed by two boosts (5 X 10° c.f.u.) after
10 and 20 days. Polyclonal class-specific anti-E. coli antibody production was 
measured in the serum after 25 days by ELISA. 

Bacterial RNA. Total bacterial RNA was isolated from thyA⁻ E. coli using the e.z.n.a.
Bacterial RNA Kit (Omega Bio-Tek), following the manufacturer's instructions.
Contaminating DNA was removed by DNase digestion (TURBO DNase,
Ambion/Applied Biosystems). Alternatively, total purified E. coli (DH5α) RNA
was purchased from Ambion/Applied Biosystems, and similar results were 
obtained. Fractionation of bacterial RNA species was performed as follows. 
First, ribosomal 16S and 23S RNA (rRNA) was removed by a magnetic bead-based 
capture hybridization approach using the MICROBExpress kit (Ambion/Applied 
Biosystems). The enriched RNA was then separated into messenger RNA (mRNA) 
and small RNA (sRNA, including 5S rRNA) using the MEGAClear kit (Ambion/ 
Applied Biosystems). All separated RNA fractions were precipitated with ammo- 
nium acetate and re-suspended in nuclease-free water. RNA concentration and 
purity were determined by measuring the absorbance at 260/280 and 260/230 nm. 
RNA preparations were further visualized by 1% agarose gel electrophoresis. 
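
A spectrophotometric readout of this kind can be summarized as in the sketch below. It uses the common approximation that one A260 unit corresponds to about 40 µg ml⁻¹ of single-stranded RNA; the function name is hypothetical and no acceptance thresholds are implied, because none are given in the text.

```python
def rna_readout(a260, a280, a230, dilution_factor=1.0):
    """Concentration (ug/ml) and purity ratios from absorbance readings.

    Assumes 1 A260 unit ~ 40 ug/ml RNA (standard approximation for
    single-stranded RNA at a 1 cm path length).
    """
    concentration = a260 * 40.0 * dilution_factor
    return {
        "concentration_ug_per_ml": concentration,
        "a260_a280": a260 / a280,
        "a260_a230": a260 / a230,
    }
```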

In vitro RNA transcription. The E. coli Gro operon encoding the bacterial 
chaperonins GroEL and GroES, the GTPase Era operon or the DNA polymerase 
III operon were PCR amplified from genomic DNA isolated from thyA⁻ E. coli
using primer pairs containing a T7 promoter sequence (T7) in either the FW or
both FW and RV primer. Gro-FWT7 5′-TAATACGACTCACTATAGGGCACCAGCCGGGAAACCACG-3′;
Gro-RVT7 5′-TAATACGACTCACTATAGGAAAAGAAAAACCCCCAGACAT-3′;
Gro-RV 5′-AGATGACCAAAAGAAAAACCCCCAGACATT-3′;
Era-FWT7 5′-TAATACGACTCACTATAGGGCATATGAGCATCGATAAAAGTTAC-3′;
Era-RV 5′-TTTAAAGATCGTCAACGTAACCGAG-3′;
DNApol-FWT7 5′-TAATACGACTCACTATAGGGATGTCTGAACCACGTTTCGT-3′;
DNApol-RV 5′-AGTCAAACTCCAGTTCCACCTGCTCCGAA-3′.

PCR fragments were purified using the Nucleospin Extract II PCR purification 
kit (Macherey-Nagel), and used as DNA templates for in vitro transcription. In 
vitro transcription was performed using the MEGAscript kit T7 (Ambion/Applied 
Biosystems) following the manufacturer’s instructions. DNA templates generated 
with Gro-FWT7 and Gro-RV primers only contained a T7 promoter site in the 
sense strand and yielded single-stranded RNA, whereas PCR templates generated 
with Gro-FWT7 and Gro-RVT7 primers contained T7 promoter sequences in 
both strands, allowing transcription of two complementary strands, yielding 
double-stranded RNA. For generation of 5’-capped RNA, m7G(5’)ppp(5’)G 
cap analogue (Ambion/Applied Biosystems) was included in the transcription 
reaction at a GTP:cap ratio of 1:4. 
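
The relationship between primer choice and the expected transcript can be checked with a short sketch. It uses the Gro primer sequences listed above and the 18-nucleotide T7 promoter core (TAATACGACTCACTATAG); the dictionary and function names are hypothetical, and the logic simply counts how many primers of a pair carry the promoter at their 5′ end.

```python
T7_PROMOTER = "TAATACGACTCACTATAG"

# Gro primers as listed in the text (5' to 3').
PRIMERS = {
    "Gro-FWT7": "TAATACGACTCACTATAGGGCACCAGCCGGGAAACCACG",
    "Gro-RVT7": "TAATACGACTCACTATAGGAAAAGAAAAACCCCCAGACAT",
    "Gro-RV": "AGATGACCAAAAGAAAAACCCCCAGACATT",
}


def expected_transcript(forward, reverse):
    """Predict the in vitro transcription product of a PCR template."""
    n_promoters = sum(PRIMERS[name].startswith(T7_PROMOTER)
                      for name in (forward, reverse))
    if n_promoters == 2:
        # Both strands are transcribed and can anneal.
        return "double-stranded RNA"
    if n_promoters == 1:
        return "single-stranded RNA"
    return "no transcript"
```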

RNA digestion, dephosphorylation and polyadenylation. In-vitro-transcribed
Gro RNA, total E. coli RNA or E. coli mRNA were digested using RNase I (Promega)
and RNase III (Ambion/Applied Biosystems). To remove 5′-triphosphates, RNA
dephosphorylation was performed by incubating 10 µg in-vitro-transcribed RNA
or total E. coli RNA or 1 mg of E. coli mRNA with 30 U of calf intestinal alkaline
phosphatase (CIP, New England Biolabs) for 2 h at 37 °C, as described previously (ref. 51).
Polyadenylation of in-vitro-transcribed and purified bacterial mRNA was performed
using the poly(A) Tailing kit (Ambion) following the manufacturer's
instructions.

Transfection of macrophages and MEFs. For direct cytosolic delivery of total 
purified E. coli RNA or in-vitro-transcribed Gro RNA, 5 X 10° BMMs or 2 X 10° 
MEFs were transfected with 1mg of RNA using 21 of Lipofectamine 2000 
(Invitrogen) in 24- or 12-well plates, respectively. 

Soluble ligands, inhibitors and other reagents. Lipopolysaccharide (LPS) was 
purchased from Sigma (E. coli 055:B5, phenol extracted). Caspase inhibitors 
z-YVAD, z-IEDT, Q-VD-OPH (all SM Biochemicals) were used at 50 µM, and
added 30 min before stimulation of cells. 

Statistical analysis. Statistical significances were tested by an ANOVA Kruskal-Wallis
test and Bonferroni-Dunn post hoc correction. Significances are represented
in the figures as follows: *, P ≤ 0.05; **, P ≤ 0.01; ***, P ≤ 0.001. NS, not
statistically significant; hash symbol, not detected.


31. Torchinsky, M. B., Garaude, J., Martin, A. P. & Blander, J. M. Innate immune
recognition of infected apoptotic cells directs TH17 cell differentiation. Nature 458,
78-82 (2009).

32. Blander, J. M. & Medzhitov, R. Regulation of phagosome maturation by signals
from toll-like receptors. Science 304, 1014-1018 (2004).

33. Sutterwala, F. S. et al. Critical role for NALP3/CIAS1/Cryopyrin in innate and adaptive
immunity through its regulation of caspase-1. Immunity 24, 317-327 (2006).

34. Kuida, K. et al. Altered cytokine export and apoptosis in mice deficient in
interleukin-1β converting enzyme. Science 267, 2000-2003 (1995).

35. Sato, M. et al. Distinct and essential roles of transcription factors IRF-3 and IRF-7 in
response to viruses for IFN-α/β gene induction. Immunity 13, 539-548 (2000).

36. Maurelli, A. T., Baudry, B., d'Hauteville, H., Hale, T. L. & Sansonetti, P. J. Cloning of
plasmid DNA sequences involved in invasion of HeLa cells by Shigella flexneri.
Infect. Immun. 49, 164-171 (1985).

37. Haraga, A., Ohlson, M. B. & Miller, S. I. Salmonellae interplay with host cells. Nature
Rev. Microbiol. 6, 53-66 (2008).

38. Schnupf, P. & Portnoy, D. A. Listeriolysin O: a phagosome-specific lysin. Microbes
Infect. 9, 1176-1187 (2007).

39. Gentleman, R. C. et al. Bioconductor: open software development for
computational biology and bioinformatics. Genome Biol. 5, R80 (2004).


©2011 Macmillan Publishers Limited. All rights reserved 


40. Bolstad, B. M., Irizarry, R. A., Astrand, M. & Speed, T. P. A comparison of
normalization methods for high density oligonucleotide array data based on
variance and bias. Bioinformatics 19, 185-193 (2003).

41. Irizarry, R. A. et al. Summaries of Affymetrix GeneChip probe level data. Nucleic
Acids Res. 31, e15 (2003).

42. Dai, M. et al. Evolving gene/transcript definitions significantly alter the
interpretation of GeneChip data. Nucleic Acids Res. 33, e175 (2005).

43. Saeed, A. I. et al. TM4: a free, open-source system for microarray data management
and analysis. Biotechniques 34, 374-378 (2003).

44. Saeed, A. I. et al. TM4 microarray software suite. Methods Enzymol. 411, 134-193
(2006).

45. Sartor, M. A. et al. Intensity-based hierarchical Bayes method improves testing for
differentially expressed genes in microarray experiments. BMC Bioinformatics 7,
538 (2006).

46. Storey, J. D. & Tibshirani, R. Statistical significance for genomewide studies. Proc.
Natl Acad. Sci. USA 100, 9440-9445 (2003).

47. Samarajiwa, S. A., Forster, S., Auchettl, K. & Hertzog, P. J. INTERFEROME: the
database of interferon regulated genes. Nucleic Acids Res. 37, D852-D857 (2009).

48. Pahl, H. L. Activators and target genes of Rel/NF-κB transcription factors.
Oncogene 18, 6853-6866 (1999).

49. Coll, R. C. & O'Neill, L. A. New insights into the regulation of signalling by toll-like
receptors and nod-like receptors. J. Innate Immun. 2, 406-421 (2010).

50. Lim, S. Y., Bauermeister, A., Kjonaas, R. A. & Ghosh, S. K. Phytol-based novel
adjuvants in vaccine formulation: 2. Assessment of efficacy in the induction of
protective immune responses to lethal bacterial infections in mice. J. Immune
Based Ther. Vaccines 4, 5 (2006).

51. Hornung, V. et al. 5′-Triphosphate RNA is the ligand for RIG-I. Science 314,
994-997 (2006).



doi:10.1038/nature10006 


Reprogramming transcription by distinct classes of 
enhancers functionally defined by eRNA 


Dong Wang"*, Ivan Garcia-Bassets~**, Chris Benner", Wenbo Li’, Xue Su?*, Yiming Zhou’, Jinsong Qiu, Wen Liu’, 
Minna U. Kaikkonen!, Kenneth A. Ohgi*, Christopher K. Glass', Michael G. Rosenfeld” & Xiang-Dong Fu' 


Mammalian genomes are populated with thousands of transcrip- 
tional enhancers that orchestrate cell-type-specific gene expression 
programs’, but how those enhancers are exploited to institute 
alternative, signal-dependent transcriptional responses remains 
poorly understood. Here we present evidence that cell-lineage- 
specific factors, such as FoxA1, can simultaneously facilitate and 
restrict key regulated transcription factors, exemplified by the 
androgen receptor (AR), to act on structurally and functionally 
distinct classes of enhancer. Consequently, FoxA1 downregula- 
tion, an unfavourable prognostic sign in certain advanced prostate 
tumours, triggers dramatic reprogramming of the hormonal res- 
ponse by causing a massive switch in AR binding to a distinct 
cohort of pre-established enhancers. These enhancers are func- 
tional, as evidenced by the production of enhancer-templated 
non-coding RNA (eRNA’°) based on global nuclear run-on sequen- 
cing (GRO-seq) analysis®, with a unique class apparently requiring 
no nucleosome remodelling to induce specific enhancer-promoter 
looping and gene activation. GRO-seq data also suggest that 
liganded AR induces both transcription initiation and elongation. 
Together, these findings reveal a large repository of active enhancers 
that can be dynamically tuned to elicit alternative gene expression 
programs, which may underlie many sequential gene expression 
events in development, cell differentiation and disease progression. 

The wide diversity of mammalian cells is determined by a large 
repertoire of constitutive and inducible genes, which are regulated by 
general and cell-type-specific transcription factors and cofactors 
through regulatory genomic elements”. Recent studies reveal that gene 
promoters are marked by tri-methylated H3K4 (H3K4me3) and distal 
regulatory elements are often associated with mono-methylated H3K4 
(H3K4me1)’”. Because these H3K4me1-positive, H3K4me3-negative
regions exhibit striking cell-type specificity'’, we used this signature to
characterize potential enhancers in prostatic LNCaP cells in which one
of the key regulatory transcriptional programs is mediated by the AR. We
identified by chromatin immunoprecipitation (ChIP)-sequencing
14,283 H3K4me3-marked and 51,544 H3K4me1-marked loci in
androgen (5α-dihydrotestosterone (DHT))-treated LNCaP cells,
among which 43,565 loci are uniquely marked by H3K4me1, largely
localized distal to annotated transcriptional start sites (TSSs) (94%),
and associated with other marks linked to enhancer activities (Fig. 1a).

De novo DNA motif analysis revealed several highly enriched 
motifs, particularly the forkhead motif (Fig. 1b). Using a specific 
antibody against FoxA1, a major FOX family member expressed in
LNCaP cells and normal prostate gland’"'' (Supplementary Fig. 1), we
identified 33,426 FoxA1-bound sites, which extensively overlap with
distal H3K4me1-marked regions (Fig. 1c and Supplementary Fig. 2a;
see the KLK3 enhancer’ in Supplementary Fig. 2b). RNA profiling
supports the functional relevance of these FoxA1/H3K4me1 loci,
as genes responsive to FOXA1 short interfering RNA (siRNA) are
located more proximally to FoxA1/H3K4me1-marked loci than
non-responsive genes (Fig. 1d and Supplementary Fig. 3).

FoxA1 has been characterized as a ‘pioneer’ factor to facilitate DNA
binding by other sequence-specific transcription factors”’**° and ‘translate’
H3K4me1/me2 into AR-mediated gene expression’. Comparing
the profile of H3K4me1 and H3K27ac before and after FOXA1 knockdown,
we detected three classes of FoxA1-binding sites based on the
H3K4me1 signal exhibiting reduced (~22%), relatively unaffected
(~74%) or even increased (~3.4%) levels over candidate enhancers
(Fig. 1e-g and Supplementary Fig. 4). RNA profiling analysis agrees with
the functional significance of these selective FoxA1 effects, revealing
more downregulated genes in the first class, roughly equal numbers of
up- or downregulated genes in the second and more upregulated genes
in the third (Fig. 1h), suggesting a contribution of FoxA1 to ‘writing’ and
‘reading’ the ‘histone code’ on different enhancer cohorts, in line with its
critical function in prostate gland development'*”.
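
The three-way classification of FoxA1-binding sites can be sketched as a simple fold-change rule, using the 1.5-fold cut-offs given in the legend to Fig. 1e-g. This is an illustration only: the function name and signal values are hypothetical, and the sketch ignores any statistical testing behind the 'no significant change' class.

```python
def classify_foxa1_site(h3k4me1_control, h3k4me1_knockdown, fold=1.5):
    """Assign a FoxA1-bound site to one of three H3K4me1 response classes.

    Inputs are H3K4me1 ChIP-seq signals (for example normalized tag
    counts) at one site in control and FOXA1-knockdown cells.
    """
    if h3k4me1_control <= 0:
        raise ValueError("control signal must be positive")
    ratio = h3k4me1_knockdown / h3k4me1_control
    if ratio < 1.0 / fold:
        return "reduced"      # greater than 1.5-fold decrease
    if ratio > fold:
        return "increased"    # greater than 1.5-fold increase
    return "unaffected"
```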

The rationale for our experimental strategy to use RNA interference
(RNAi) to study the FoxA1-regulated enhancer network is the association
of decreased FOXA1 expression with castration-resistant, poor-prognosis
prostate tumours (Supplementary Fig. 5). In LNCaP cells,
FOXA1 RNAi enhanced cell entrance to S phase with reduced hormone
(Fig. 2a). To understand the mechanistic basis for elevated hormone
responsiveness, we mapped AR-binding sites, identifying 3,115
high-confidence loci with approximately 65% co-incident with H3K4me1.
De novo motif analysis revealed highly enriched elements for both
AR and FoxA1, including a composite motif consisting of a FOX motif
and AR regulatory element (ARE) half site, suggesting ternary complex
formation on these sites (Fig. 2b). Indeed, 1,684 AR-bound loci (54% of
total) are co-occupied by FoxA1 in DHT-treated LNCaP cells and
FoxA1 appears to bind to most of these sites (~70%) before hormone
treatment (Supplementary Fig. 6).

The conundrum is that, although FoxA1 is known to facilitate AR
binding on several DHT-responsive genes’, FOXA1 RNAi actually
markedly elevated, rather than diminished, the DHT response
(Fig. 2a). We found that approximately 60% of the original AR binding
events were ‘expectedly’ lost in response to FOXA1 RNAi, which we
refer to as the ‘lost’ AR program (Fig. 2c, d). We refer to the remaining
approximately 40% of AR binding events as the ‘conserved’ AR program,
which often exhibited enhanced AR binding. Strikingly, we detected a
massive gain of 10,869 new AR binding loci, referred to as the ‘gained’
AR program (Fig. 2c, d). We extensively validated each of these AR
programs by conventional ChIP-quantitative PCR (qPCR) (Fig. 2e).
This induced AR reprogramming appears to be qualitatively and
quantitatively distinct from reported AR re-targeting on androgen-resistant
LNCaP-abl cells compared with parental LNCaP cells'” and
is in sharp contrast to FoxA1-dependent genomic targeting of the
oestrogen receptor-α (ER-α) in breast cancer MCF7 cells'®. In concert
with such massive AR reprogramming, we observed corresponding
Figure 1 | FoxA1 contributes to the enhancer code in prostate cancer cells.
a, Distribution of histone marks within ±2-kb windows around distinct
genomic regions (n = 43,565) marked by H3K4me1, but not H3K4me3, in
androgen (DHT)-stimulated LNCaP cells. The ChIP-seq data sets for
H3K4me1, H3K4me2, H3K4me3, H3K27ac, H4K5ac and p300 were each
aligned with respect to the centre of the H3K4me1 signal and sorted by the
length of H3K4me1-marked regions. b, Top-enriched DNA motifs with
significant P values and prospective families of DNA binding transcription
factors identified by de novo motif analysis of non-promoter regions marked by
H3K4me1. c, Percentage of H3K4me1-marked regions that show FoxA1
binding events (top panel) and percentage of FoxA1-binding sites that are
marked by H3K4me1 (bottom panel). Note that H3K4me1-marked regions
tend to be broad, but FoxA1-binding sites are discrete; as a result, many
H3K4me1-positive regions may contain more than one FoxA1-binding site.
d, Genomic distance from FoxA1/H3K4me1-positive loci to the nearest TSS of
genes in response to FOXA1 knockdown. Outliers were omitted from box plots.
P values indicate the significance in pair-wise comparisons. e-g, Three classes
of FoxA1/H3K4me1-positive loci according to the response in levels of
H3K4me1 to FOXA1 knockdown: greater than 1.5-fold decrease (e), no
significant change (f) and greater than 1.5-fold increase (g). h, Ratio (log2) of
up- and downregulated genes in each H3K4me1-responsive category in
e-g. CTL, control.


changes in gene expression in each of three AR programs (Fig. 2f, g and 
Supplementary Fig. 7). The newly induced AR expression program is 
also linked to AR binding events (Fig. 2h), suggesting a direct gain-of- 
function on DHT-responsive genes, as illustrated on SOX9 and other 
genes (Supplementary Fig. 8), which have been previously documen- 
ted to play critical roles in cancer progression’””*. Because we also 
observed an approximate threefold elevation of AR expression in 
FOXA1 RNAi-treated cells (Supplementary Fig. 9a), we tested the 
possibility that increased AR expression might trigger these effects. 
We found that AR overexpression alone was insufficient to induce 
AR reprogramming (Supplementary Fig. 9b). 

To explore the mechanism for AR reprogramming, we determined 
FoxA1 binding on different AR programs. We found that the gained 
AR program is largely devoid of FoxA1, whereas FoxA1 is present in 
more than half of the lost and conserved AR programs (Supplementary 
Fig. 10). This raises the possibility that FoxA1 may facilitate AR bind- 
ing to its original binding program, but trans-repress AR from binding 
to other genomic regions that lack FoxA1-binding sites in the gained 
program, a strategy frequently used by other transcription activators”’. 


Figure 2 | AR reprogramming and induced 

Indeed, as previously reported”, FoxA1 overexpression squelched
ARE-driven transcription in transfected HEK293 cells (Supplementary
Fig. 11), which is consistent with the ability of AR to interact with
FoxA1 directly”. This mechanism appears to be exploited during
tumour progression because an AR mutant identified in advanced
prostate tumours, which lacks part of the hinge domain important for
interactions with FoxA1, lost its ability to interact with FoxA1 and became
resistant to FoxA1-mediated trans-repression (Supplementary Fig.
11b, c). Furthermore, our functional analysis indicates that the missing
AR ligand-binding domain also contributes to AR:FoxA1 interactions 
(Supplementary Fig. 12). Interestingly, similar AR truncations have 
also been reported to result from alternative splicing, gene rearrange- 
ment and/or calpain-mediated cleavage (Supplementary Fig. 13). 
Based on these findings, we propose that FoxA1 regulates AR genomic 
targeting by simultaneously anchoring AR to cognate loci and restrict- 
ing AR from other ARE-containing loci in the human genome. 

To understand how reprogrammed AR binding is translated to 
altered hormonal response, we took advantage of the recently estab- 
lished GRO-seq® to detect the functional relationship between AR 
binding and hormone-induced gene expression. This powerful 
genome-wide interrogation of ongoing transcription detected a broad 
scope of nascent RNAs. We uncovered 28,318 transcripts with 15,656
annotated and 12,662 unannotated transcripts, among which 450 coding
and 347 unannotated transcripts were induced more than 1.5-fold
with even just 1 h DHT treatment (Supplementary Fig. 14). The TSSs
of GRO-seq defined transcripts are typically marked by H3K4me3 and 
H3K27ac (Supplementary Fig. 15a, b). Importantly, GRO-seq also 
detected non-coding RNAs from a subset of H3K4me1-positive,
H3K4me3-negative regions (Supplementary Fig. 15c). As illustrated
on the enhancer of the KLK3 transcription unit (Fig. 3a), these eRNAs
are largely symmetrical and bidirectional (see additional examples on
other well-known hormone regulated genes, such as PMEPA1 and
KLK2 in Supplementary Fig. 16). Interestingly, we often detected a 
large amount of nascent RNA before DHT treatment, particularly near 
their TSSs (for example, KLK3); DHT not only enhanced the expres- 
sion of these nascent RNAs, but also allowed the extension of tran- 
scription towards the end of the gene (Fig. 3a and Supplementary Fig. 
16). We estimated that approximately 79% of the transcription units 
induced by liganded AR are regulated at the level of transcriptional 
initiation, whereas approximately 21% appear to be primarily regu- 
lated at the level of elongation (Supplementary Fig. 17). 

The ability to detect regulated eRNA expression allowed us to ana- 
lyse different AR programs during transcriptional reprogramming. In 
the presence of FoxA1, DHT enhanced eRNA expression from AR- 
bound enhancers in both the lost and conserved AR programs. In 
contrast, a basal level of eRNAs was detectable on the gained program, 
but was independent of the hormone treatment, indicating that these 
are pre-established enhancers (Fig. 3b). In response to FOXA1 RNAi, 

Figure 3 | Transcriptional response on individual enhancer programs to
FOXA1 downregulation. a, Display of nascent RNA detected by GRO-seq on
the KLK3 locus. The DHT-induced AR binding is shown at bottom as a
reference. b, c, Induction of eRNA by DHT (b) or FOXA1 knockdown in
DHT-treated LNCaP cells (c). The eRNA levels under different conditions (indicated
at bottom) are separately displayed on three AR-binding programs. d, e, Effects
of FoxA1 on binding of p300 (d) and Med12 (e) in each AR program in
DHT-treated LNCaP cells. f, g, Long-distance interaction between gene promoter and
AR-bound site was determined by the 3C assay on two representative gene loci
selected from the conserved and gained AR programs. Negative controls at
shorter distances and a positive control with the corresponding BAC in the
region are included in each case.

the expression of eRNAs was diminished from the lost program, but
modestly or dramatically enhanced from the conserved and gained 
programs, respectively (Fig. 3c). The DHT-induced nascent tran- 
scripts (detected by GRO-seq) and steady-state RNAs (detected by 
microarrays) best predict direct target genes by liganded AR, as they 
show the shortest distance (<50 kilobases (kb)) to nearby AR-binding 
sites compared with genes identified by either criterion alone (Sup- 
plementary Fig. 18), indicating that AR-activated enhancers marked 
by increased eRNA are responsible for activation of nearby coding 
transcription units. 

In concert with differential eRNA expression, we also observed 
corresponding changes in levels of another mark in the final step of 
enhancer activation’, specifically p300, on both conserved and gained 
AR programs (Fig. 3d). Interestingly, enhancers in the lost AR pro- 
gram continued to exhibit significant p300 binding, even after AR 
binding and eRNA expression were diminished in FOXA1 knockdown 
cells (Fig. 3c, d). The transcription mediator Med12 has recently been 
suggested to mediate enhancer-promoter looping”. We tested Med12 
binding on individual AR programs, finding that it exhibited an ident- 
ical binding pattern to p300 (Fig. 3e). Enhanced Med12 binding on the 
conserved and gained programs after FOXA1 knockdown suggests 
elevated or newly activated enhancer-promoter interactions. This 
was demonstrated by the 3C assay on two representative genes where
FOXA1 knockdown either enhanced (on the FASN locus from the
conserved AR program) or created new (on the NDRG1 locus in the
gained AR program) long-range interactions between AR-bound
enhancers and specific gene promoters in DHT-treated cells (Fig. 3f,
g and Supplementary Fig. 19). These data strongly suggest that the
induction of eRNAs, rather than binding of either p300 or Med12, is 
the most precise mark of the final, functional looping between an 
activated enhancer and its regulated gene promoter. 

Addressing the structural basis for different functional classes of AR
enhancers, we noted distinct profiles of H3K4me1 and H3K27ac
on the lost, conserved and gained AR programs; FOXA1 RNAi had
little effect on these profiles (Fig. 4a, b and Supplementary Fig. 20). The
histone marks H3K4me1 and H3K27ac around the lost and conserved
AR programs exhibit a bimodal distribution, which is particularly
pronounced on the lost program (Fig. 4a, bottom panel). The DNA-binding
sites in the lost AR program are actually significantly less
enriched in canonical AREs, which may render AR binding on these
sites particularly dependent on FoxA1, whereas both the conserved and
gained AR programs are associated with nearly perfect palindromic,
canonical AREs (Supplementary Fig. 21), explaining why AR is able to
target those sites in a FoxA1-independent manner. Strikingly, the
gained AR-binding sites are coincident with sharp H3K4me1 and
H3K27ac peaks (Fig. 4a, b, middle panels), suggesting a distinct
nucleosome architecture underlying the gained AR program.

A recent study has suggested that AR binding leads to dynamic 
dismissal of a central, H2A.Z-containing nucleosome, being replaced 
by two flanking H3K4me2-marked nucleosomes”. We found that the 
lost AR program was largely devoid of a ‘central’ nucleosome even 
before AR binding (Fig. 4c, bottom panel). The conserved AR program 
exhibited a DHT-induced switch from the central H3K4me2-marked 
nucleosome to two flanking H3K4me2-marked nucleosomes, which 

Figure 4 | Distinct classes of AR enhancers in the human genome.
a, b, Profiles of H3K4me1 (a) and H3K27ac (b) associated with the lost (bottom
panels), conserved (top panels) and gained (middle panels) AR programs in
DHT-treated LNCaP cells in response to FOXA1 knockdown. c, d, Profiles of
H3K4me2 around AR binding loci at the nucleosomal resolution in response to
DHT stimulation in control siRNA-treated (c) or FOXA1 siRNA-treated
(d) LNCaP cells. e, Profiles of the histone variant H2A.Z on the three different
AR programs. f, Model for FoxA1-mediated AR targeting and reprogramming
in LNCaP cells. In class I (the lost AR program), FoxA1 licenses liganded AR to
bind to ARE in relatively nucleosome-free regions. AR binding does not induce
nucleosome remodelling in this class of enhancers. In class II (the conserved AR
program), AR binds independently of FoxA1 to ARE, inducing nucleosome
remodelling. In class III (the gained AR program), FoxA1 restricts AR binding,
despite the presence of strong AREs. Although pre-established, these gained
loci exhibit a strong central nucleosome and are associated with H2A.Z, which
is not affected by AR binding. FOXA1 knockdown converted these sites to
androgen-responsive sites. In all these three classes, eRNAs were generated or
increased after AR binding. e:p, enhancer:promoter.

is largely independent of FoxA1 (Fig. 4c, d, top panels). The gained
program showed a strong H3K4me2-marked central nucleosome both 
before and after AR binding (Fig. 4c, d, middle panel). Thus, this 
gained AR program represents a new type of enhancer topography 
that requires no nucleosome remodelling for enhancer recognition 
and subsequent enhancer-promoter interactions. H2A.Z is preva- 
lently associated with the gained AR program, modestly with the con- 
served AR program and absent in the lost AR program (Fig. 4e). 
Together, these findings establish distinct chromatin structures under- 
lying functionally distinct classes of AR enhancer. 

In summary, our findings imply a general principle for establishing 
cell-type-specific transcription programs. Cell-lineage-specific factors 
(such as FoxA1) coupled with other general transcriptional factors 
‘create’ a cell-type-specific enhancer network, allowing other regulated 
factors (such as AR) to ‘activate’ these pre-established enhancers 
(Fig. 4f). The enhancer activation process is tightly linked to eRNA
production, which appears to serve as a more robust indicator of
enhancer activities than any enhancer-bound transcription activators
or chromatin marks. In the current biological model, AR reprogramming
dramatically altered the androgen-responsive pathway, which,
according to GO analysis (Supplementary Figs 22 and 23), may
contribute to enhanced cell growth and the establishment of an 
appropriate microenvironment in advanced prostate cancer*®**. 
Together, these findings provide a conceptual framework to under- 
stand complex gene-expression switching events, as occurs during 
disease progression and development. 


METHODS SUMMARY 


Experiments were performed on LNCaP cells, LNCaP-AR cells (gift of C. Sawyers) 
and HEK293 cells. ChIPs were done as previously described” and GRO was per- 
formed as described*”®. Control siRNA was purchased from Qiagen (1027280). 
FOXA1 siRNA 1 (M-010319) and 2 (sense 5'-GAGAGAAAAAAUCAACAGC-3’; 
antisense 5'-GCUGUUGAUUUUUUCUCUC-3’)’ were purchased from or synthe- 
sized by Dharmacon. 
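
As a quick consistency check on the siRNA 2 duplex quoted above, the antisense strand should be the reverse complement of the sense strand. The following is a minimal sketch; the function name `reverse_complement_rna` is illustrative and not part of the original methods.

```python
def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


# Strand sequences copied from the text (both written 5'->3').
sense = "GAGAGAAAAAAUCAACAGC"
antisense = "GCUGUUGAUUUUUUCUCUC"

# True when the two strands can form a fully complementary duplex.
strands_are_complementary = reverse_complement_rna(sense) == antisense
```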


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS 

Antibodies. Specific antibodies were purchased from the following commercial
sources: anti-FoxA1 (ab5089), anti-H3K4me1 (ab8899), anti-H3K27ac (ab4729),
anti-H3K36me3 (ab9050) and H2A.Z (ab4174) from Abcam; anti-AR (N-20),
anti-FoxA1 (C-20) and p300 (C-20, gift of B. Ren) from Santa Cruz
Biotechnology; anti-H3K4me2 (07-030), anti-H3K4me3 (07-473), anti-H4K5ac
(07-327) and anti-H3K27me3 (07-449) from Upstate Biotechnology; anti-Med12
(A300-774A) from Bethyl Laboratories; and anti-beta Actin (AC74) from Sigma.
siRNA transfection. One day before transfection, LNCaP cells were seeded in 
RPMI 1640 medium with 10% FBS. Six hours after siRNA transfection (20 pmol
ml⁻¹) with Lipofectamine 2000 (Invitrogen), cells were washed twice with PBS
and then maintained in hormone-deprived phenol-free RPMI 1640 media. For
gene expression profiling and western blotting, cells were cultured for 3 days after
transfection and then treated with DHT for 20 h; for ChIP-qPCR and ChIP-seq,
cells were cultured for 4 days after transfection and then treated with DHT for 1 h.
ChIP-seq analysis at the nucleosome resolution was based on cells treated with
DHT for 4 h.

3C assay. Cells were crosslinked with 1% formaldehyde for 20 min at room tem- 
perature and processed according to the standard 3C protocol’. For the study on 
the FASN locus, fixed chromatin from 5 X 10° cells was digested with 400 units of 
BglII and EcoR I (NEB). For the NDRG1 locus, fixed chromatin from 5 X 10° cells 
was digested with 400 units of HindIII (NEB). Ligation was done with 800 units of 
T4 DNA ligase (NEB) for 4h. The 3C product was quantified by qPCR after 
diluting the template tenfold compared with purified genomic DNA of known 
concentration. For each semi-quantitative PCR, the amount of template was 
titrated to determine the linear range in which the PCR product was amplified. 
PCR primers were designed next to BglII and HindIII restriction sites, respectively,
for the FASN (all in minus strand) promoter (5'-AAGCTGTGAGTCAGCATGGTAG-3')
and three upstream sites (−38 kb, 5'-TGTCTTCTGATGTGTCTGCTTAGAG-3';
−45 kb, 5'-AATCCTGCTCAGGAATCTGTATGT-3'; −54 kb,
5'-GGACACTACTGCTTTTTCCTGTG-3') and for the NDRG1 (all in plus
strand) promoter (5'-ATAGGTTCTGCCTTATTAGGG-3') and three upstream
sites (−42 kb, 5'-ATAGAGTTAGAGAAACGGAGGCAGT-3'; −56 kb,
5'-GCCGTGAAGAATAAACAAGATGAG-3'; −62 kb,
5'-ACACATTTTGTTCCCAGTGCAG-3').

Co-IP and western blotting analysis. HEK293 cells were seeded for 1 day, trans- 
fected with the expression plasmids expressing wild-type, mutant AR and FoxA1 
using Lipofectamine 2000 (Invitrogen) and then changed to hormone-depleted,
phenol-free DMEM medium. One day after plasmid transfection, cells were
treated with 100 nM DHT for another day. Cells were washed twice with cold PBS
and treated with 1 ml of lysis buffer (50 mM Tris pH 8.0, 150 mM NaCl, 1% NP-40)
supplemented with a cocktail of proteinase inhibitors (Sigma) for 5 min at
4 °C. Lysed cells were collected, rotated for 1 h at 4 °C and cell debris removed by
centrifugation at 18,000g for 30 min in a cold room. The supernatant was incubated
with anti-AR, anti-FoxA1 or immunoglobulin-G overnight at 4 °C followed
by the addition of 50 µl of 50% protein G beads to each tube. After rotating for
another 2 h at 4 °C, the beads were washed five times with the lysis buffer, twice
with cold PBS, and boiled for 6 min in 40 µl of 2× SDS loading buffer. Western
blotting analysis was performed with anti-AR or anti-FoxA1. 

Luciferase reporter assay. PC3-AR and HEK293 cells were seeded into 24-well 
plates in hormone-depleted and phenol-free RPMI 1640 medium and DMEM
1 day before transfection. Transfection was performed according to the manufacturers’
recommendations (DOTAP Liposomal Transfection Reagent from Roche
or Lipofectamine 2000 from Invitrogen). One day after transfection, these cells were
treated with DHT for an additional day. After washing with cold PBS twice, cells 
were treated with the lysis buffer (Promega) and the Luciferase signal was 
recorded. 

Cell proliferation assay. The assay was based on the published protocol. Briefly, 
LNCaP cells were transfected with control siRNA and FoxA1 siRNA (sequences 
listed in Methods Summary) and cultured in hormone-depleted medium for 
3 days. The cells were treated with different amounts of DHT for another day.
After the treatment, cells were washed with PBS, fixed with 70% EtOH and stored
at −20 °C for at least 2 h. Before analysis, cells were washed with cold PBS,
re-suspended in the propidium-iodide/Triton X-100 staining solution and incubated
at 37 °C for 15 min. After removing cell clumps, stained cells were sorted on a
Beckman FACScan, and the percentage of S-phase cells was calculated.

ChIP and ChIP-seq analyses. ChIP was as previously described”. Briefly, 
approximately 10’ treated cells were crosslinked with 1% formaldehyde at room 
temperature for 15 min. After sonication, the soluble chromatin was incubated 
with 1-5 µg of antibody. Specific immunocomplexes were precipitated with
Protein A/G beads (Sigma-Aldrich). Complexes were washed, DNA extracted
and purified by QIAquick Spin columns (Qiagen). Extracted DNA (1 µl from
60 µl) was used for qPCR with the specific PCR primers listed in Supplementary
Fig. 24, each of which was designed surrounding a specific region of 150-250 base
pairs (bp) on target DNA. PCR products were detected with SYBR Green on a
MX3000P System (Stratagene) and the percentage of immunoprecipitated chromatin
was calculated from ΔCt relative to immunoglobulin-G control after normalizing
against input chromatin. For ChIP-seq, extracted DNA was ligated to
specific adaptors followed by deep sequencing in the Illumina GAII system 
according to the manufacturer’s instructions. The first 25 bp for each sequence 
tag returned by the Illumina Pipeline was aligned to the hg18 assembly (National 
Center for Biotechnology Information, build 36.1) using Bowtie, allowing up to 
two mismatches. Only tags uniquely mapped to the genome were used for further 
analysis. The data were visualized by preparing custom tracks for the University of 
California, Santa Cruz, genome browser using HOMER”® (http://biowhat.ucsd. 
edu/homer). The total number of mappable reads was normalized to 10” for each 
experiment presented in this study. ChIP-seq at nucleosome resolution was per- 
formed as previously reported**. A summary of ChIP experiments is provided in 
Supplementary Fig. 25. 
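
The two quantitative steps in this section, the qPCR percent-of-input calculation and the scaling of each library to a fixed number of mappable reads, can be sketched as follows. This is an illustrative sketch only: the percent-of-input formula is the commonly used ΔCt form and is assumed rather than taken from the authors' protocol, and the function names, the `input_fraction` parameter and the `target_total` parameter are hypothetical.

```python
import math


def percent_input(ct_ip: float, ct_input: float, input_fraction: float) -> float:
    """Percent of input chromatin recovered by an IP (common delta-Ct form).

    input_fraction is the fraction of chromatin saved as input (e.g. 0.01
    for a 1% input); the input Ct is adjusted to represent 100% input.
    """
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)


def normalize_tag_counts(counts, total_mapped_tags: int, target_total: float):
    """Scale raw tag counts so that the whole library sums to target_total."""
    if total_mapped_tags <= 0:
        raise ValueError("total_mapped_tags must be positive")
    scale = target_total / total_mapped_tags
    return [count * scale for count in counts]
```

Scaling every experiment to the same total makes tag densities comparable between libraries of different sequencing depth; the specific IgG-relative comparison described in the text would be applied on top of values such as those returned by `percent_input`.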

Identification of ChIP-seq peaks. The identification of ChIP-seq peaks (bound 
regions) was performed using HOMER (http://biowhat.ucsd.edu/homer). For 
transcription factors, peaks were identified by searching locations of high read 
density using a 200-bp sliding window. Regions of maximal density exceeding a 
given threshold were called as peaks, and we required adjacent peaks to be at least 
500 bp away to avoid redundant detection. Only one tag from each unique position 
was considered to avoid clonal artefacts from the sequencing. The threshold for the 
number of tags that determined a valid peak was selected at a false discovery rate of 
0.001 determined by peak finding using randomized tag positions in a genome 
with an effective size of 2 × 10⁹ bp. We also required peaks to have at least fourfold 
more tags (normalized to total count) than input control samples. In addition, we 
required fourfold more tags relative to the local background region (10 kb) to avoid 
identifying regions with genomic duplications or non-localized binding. 
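
The filtering criteria in this paragraph can be illustrated with a small sketch. It is not HOMER's implementation: the `CandidatePeak` fields, the `min_tags` threshold argument (standing in for the FDR-derived tag threshold) and the `pseudocount` guard against division by zero are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class CandidatePeak:
    chrom: str
    center: int
    tags: float        # normalized tags in the 200-bp window
    input_tags: float  # normalized input-control tags in the same window
    local_tags: float  # normalized tags expected from the local 10-kb background


def passes_filters(peak, min_tags, min_fold=4.0, pseudocount=1.0):
    """Apply the tag threshold and the fourfold input and local-background tests."""
    over_input = peak.tags / max(peak.input_tags, pseudocount)
    over_local = peak.tags / max(peak.local_tags, pseudocount)
    return peak.tags >= min_tags and over_input >= min_fold and over_local >= min_fold


def select_peaks(candidates, min_tags, min_distance=500):
    """Keep the strongest passing peaks, dropping any within min_distance of a kept one."""
    kept = []
    for peak in sorted(candidates, key=lambda p: p.tags, reverse=True):
        if not passes_filters(peak, min_tags):
            continue
        too_close = any(
            k.chrom == peak.chrom and abs(k.center - peak.center) < min_distance
            for k in kept
        )
        if not too_close:
            kept.append(peak)
    return sorted(kept, key=lambda p: (p.chrom, p.center))
```

Processing candidates from strongest to weakest means that, within any 500-bp neighbourhood, the site with the highest tag density is the one retained.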

The peak finding procedure was modified to identify regions harbouring spe- 
cific histone modifications, as these experiments tend to yield broad areas of 
enrichment over several hundreds or thousands of base pairs. Seed regions were 
initially found using a peak size of 500 bp at the false discovery rate of 0.001 to 
identify enriched loci. Enriched loci found within 1 kb of one another were then 
merged to yield variable-length regions. Transcription factor peaks and histone 
modification regions were associated with gene products by identifying the nearest 
RefSeq TSS. Annotated positions for promoters, exons, introns and other features 
were based on RefSeq transcripts and repeat annotations from University of 
California, Santa Cruz. Peaks from separate experiments were considered equival- 
ent/co-bound if their peak centres were located within 200 bp of each other. Read 
density heat maps were created by first using HOMER to generate read densities 
and then visualized using Java TreeView (http://jtreeview.sourceforge.net). 
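
The region-merging and co-binding rules described above can be sketched as follows. This is a simplified illustration for a single chromosome; the function names are hypothetical and the code is not taken from HOMER.

```python
import bisect


def merge_regions(regions, max_gap=1000):
    """Merge (start, end) regions on one chromosome that lie within max_gap bp."""
    merged = []
    for start, end in sorted(regions):
        if merged and start - merged[-1][1] <= max_gap:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def co_bound_centers(centers_a, centers_b, max_distance=200):
    """Return peak centres from A that have a centre from B within max_distance bp."""
    sorted_b = sorted(centers_b)
    hits = []
    for center in centers_a:
        # First centre in B at or beyond the left edge of the allowed window.
        i = bisect.bisect_left(sorted_b, center - max_distance)
        if i < len(sorted_b) and sorted_b[i] <= center + max_distance:
            hits.append(center)
    return hits
```

With this kind of test, the AR/FoxA1 co-occupancy fractions reported in the main text correspond to `len(co_bound_centers(ar, foxa1)) / len(ar)` computed per chromosome and summed.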
HOMER for de novo motif discovery and known motif enrichment. Motif 
discovery was performed using a comparative algorithm similar to those previously 
described’. An in-depth description will be published elsewhere (Benner et al., in 
preparation). Motif finding for transcription factors was performed on sequence
from ±100 bp relative to the peak centre, whereas motif finding for histone modification
regions was performed on sequence from ±500 bp relative to the region
centre. Briefly, sequences were divided into target and background sets for each
application of the algorithm. Background sequences were then selectively weighted 
to equalize the distributions of G + C content in target and background sequences 
to avoid comparing sequences of different general sequence content. Motifs of 
length 8, 10, 12, 14, 16 and 18 bp were identified separately by first exhaustively 
screening all oligonucleotides for enrichment in the target set compared with the 
background set using the cumulative hypergeometric distribution to score enrich- 
ment. Up to three mismatches were allowed in each oligonucleotide sequence to 
increase the sensitivity of the method. The top 200 oligonucleotides of each length 
with the lowest P values were then converted into probability matrices and heur- 
istically optimized to maximize hypergeometric enrichment of each motif in the 
given data set. As optimized motifs were found they were removed from the data set 
to facilitate the identification of additional motifs in subsequent rounds. HOMER 
also screens the enrichment of known motifs previously identified through the 
analysis of published ChIP-ChIP and ChIP-Seq data sets by calculating the known 
motifs’ hypergeometric enrichment in the same set of G + C normalized sequences 
used for de novo analysis. Sequence logos were generated using WebLOGO (http:// 
weblogo.berkeley.edu). Motif enrichment heatmaps and dendrograms were created 
by clustering hypergeometric log P values using Cluster (http://bonsai.ims. 
u-tokyo.ac.jp/~mdehoon/software/cluster/software.htm#ctv) and Java TreeView 
(http://jtreeview.sourceforge.net). 
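
The enrichment score described in this section is the upper tail of a hypergeometric distribution. A minimal sketch of that calculation is given below; it uses plain integer counts and omits the G + C weighting of background sequences, so it illustrates the statistic rather than reproducing HOMER's scoring. The function and argument names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total, total_with_motif, targets, targets_with_motif):
    """P(X >= targets_with_motif) for X ~ Hypergeometric.

    total:              number of sequences (target + background)
    total_with_motif:   sequences in the whole set containing the motif
    targets:            number of target sequences
    targets_with_motif: target sequences containing the motif
    """
    upper = min(targets, total_with_motif)
    tail = sum(
        comb(total_with_motif, k) * comb(total - total_with_motif, targets - k)
        for k in range(targets_with_motif, upper + 1)
    )
    return tail / comb(total, targets)
```

Smaller values indicate that the motif occurs in the target set more often than expected from its frequency in the combined set.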

GRO-seq. Global run-on* and library preparation for sequencing” were done as
described. Briefly, four 10-cm plates of confluent LNCaP cells per treatment were
washed three times with cold PBS buffer. Cells were then swelled in swelling buffer
(10 mM Tris pH 7.5, 2 mM MgCl₂, 3 mM CaCl₂) for 5 min on ice. Harvested cells
were re-suspended in 1 ml of the lysis buffer (swelling buffer with 0.5% IGEPAL
and 10% glycerol) with gentle vortex and brought to 10 ml with the same buffer for
extraction of nuclei. Nuclei were washed with 10 ml of lysis buffer and re-suspended
in 1 ml of freezing buffer (50 mM Tris pH 8.3, 40% glycerol, 5 mM
MgCl₂, 0.1 mM EDTA), pelleted down again and finally re-suspended in 100 µl of
freezing buffer.

For run-on assay, re-suspended nuclei were mixed with an equal volume of
reaction buffer (10 mM Tris pH 8.0, 5 mM MgCl₂, 1 mM DTT, 300 mM KCl, 20
units of SUPERase-In, 1% Sarkosyl, 500 µM ATP, GTP and Br-UTP, 2 µM CTP)
and incubated for 5 min at 30 °C. Nuclear RNA was extracted with TRIzol LS
reagent (Invitrogen) according to the manufacturer’s instructions. RNA was then
re-suspended in 20 µl of DEPC-water and subjected to base hydrolysis by addition
of 5 µl of 1 M NaOH and incubated on ice for 40 min. Then, 25 µl of 1 M Tris pH
6.8 was added to neutralize the reaction. RNA was purified through a p-30 RNase-free
spin column (BioRad), according to the manufacturer’s instructions and
treated with 6.7 µl of DNase buffer and 10 µl of RQ1 RNase-free DNase
(Promega), and purified again through a p-30 column. A volume of 8.5 µl 10×
antarctic phosphatase buffer, 1 µl of SUPERase-In and 5 µl of antarctic phosphatase
was added to the run-on RNA and treated for 1 h at 37 °C. Before proceeding
to immunopurification, RNA was heated to 65 °C for 5 min and kept on ice.

Anti-BrdU agarose beads (Santa Cruz Biotech) were blocked in blocking buffer
(0.5× SSPE, 1 mM EDTA, 0.05% Tween-20, 0.1% PVP, and 1 mg ml⁻¹ BSA) for
1 h at 4 °C. Heated run-on RNA (~85 µl) was added to 60 µl beads in 500 µl
binding buffer (0.5× SSPE, 1 mM EDTA, 0.05% Tween-20) and allowed to bind
for 1 h at 4 °C with rotation. After binding, beads were washed once in low salt
buffer (0.2× SSPE, 1 mM EDTA, 0.05% Tween-20), twice in high salt buffer (0.5×
SSPE, 1 mM EDTA, 0.05% Tween-20, 150 mM NaCl) and twice in TET buffer (TE
pH 7.4, 0.05% Tween-20). BrU-incorporated RNA was eluted with 4 × 125 µl
elution buffer (20 mM DTT, 300 mM NaCl, 5 mM Tris pH 7.5, 1 mM EDTA
and 0.1% SDS). RNA was then extracted with acidic phenol/chloroform once,
chloroform once and precipitated with ethanol overnight. The precipitated
RNA was re-suspended in 50 µl reaction (45 µl of DEPC water, 5.2 µl of T4
PNK buffer, 1 µl of SUPERase-In and 1 µl of T4 PNK (NEB)) and incubated at
37 °C for 1 h. The RNA was extracted and precipitated again as above.

Complementary DNA (cDNA) synthesis was performed basically as described”®
with minor modifications. First, RNA fragments were subjected to poly-A tailing
reaction in 8.0 µl volume containing 0.8 µl poly-A polymerase buffer, 1 µl 1 mM
ATP, 0.5 µl SUPERase-In and 0.75 µl poly-A polymerase (NEB). The reaction was
performed for 30 min at 37 °C. Subsequently, reverse transcription was performed
using oNTI223 primer
(5'-pGATCGTCGGACTGTAGAACTCT;CAAGCAGAAGACGGCATACGATTTTTTTTTTTTTTTTTTTTVN-3')
where the p indicates 5' phosphorylation, ';' indicates the abasic dSpacer furan
and VN indicates degenerate nucleotides.

Tailed RNA (8.0 µl) was mixed with 1 µl dNTP (10 mM each) and 2.5 µl
12.5 µM oNTI223, heated for 3 min at 75 °C and chilled briefly on ice. Then,
0.5 µl SUPERase-In, 3 µl 0.1 M DTT, 2 µl 25 mM MgCl₂, 2 µl 10× reverse
transcription buffer and 1 µl SuperScript III reverse transcriptase (Invitrogen) were
added to the tube. The tube was incubated for 30 min at 48 °C. After that, 4 µl
of Exonuclease I (Fermentas) was added into the reaction and the tube was incubated
for 1 h at 37 °C to eliminate extra oNTI223. Then RNA was eliminated by
adding 1.8 µl 1 M NaOH and incubating for 20 min at 98 °C. The reaction was
neutralized with 1.8 µl of 1 M HCl. After running on a 10% polyacrylamide TBE-urea
gel, the extended first-strand cDNA product was excised and recovered by
soaking the ground gel in DNA gel elution buffer (TE with 0.1% Tween-20 and
150 mM NaCl) overnight and then precipitated with ethanol.

Circularization of first-strand cDNA was performed by re-suspending cDNA in
9.5 µl reaction solution (7.5 µl water, 1 µl CircLigase buffer, 0.5 µl 1 mM ATP and
0.5 µl 50 mM MnCl₂) and then adding 0.5 µl CircLigase (Epicentre). The reaction
went for 1 h at 60 °C and then was heat-inactivated for 20 min at 80 °C.
Circularized single-stranded DNA (ssDNA) was relinearized by adding 3.8 µl of
4× relinearization supplement (100 mM KCl, 2 mM DTT) followed by 1.5 µl of
ApeI (15 U, NEB). The reaction was incubated for 1 h at 37 °C. Relinearized ssDNA
was separated in a 10% polyacrylamide TBE-urea gel (Invitrogen) as described
above. The relinearized product band was excised (~120-300 bp) and the DNA
was recovered as described above.

The ssDNA template was amplified by PCR using the Phusion High-Fidelity 
enzyme (NEB) according to the manufacturer’s instructions. The oligonucleotide 
primers oNTI200 (5'-CAAGCAGAAGACGGCATA-3') and oNTI201
(5'-AATGATACGGCGACCACCGACAGGTTCAGAGTTCTACAGTCCGACG-3') were
used to generate DNA for sequencing. PCR was performed with an initial 30-s
denaturation at 98 °C, followed by 13 cycles of 10-s denaturation at 98°C, 15-s 
annealing at 60 °C and 15-s extension at 72°C. The PCR product was run on a 
non-denaturing 8% polyacrylamide TBE gel and recovered as mentioned before. 


DNA was then sequenced on the Illumina Genome Analyser II according to the
manufacturer’s instructions with small RNA sequencing primer
5'-CGACAGGTTCAGAGTTCTACAGTCCGACGATC-3'.

De novo identification of GRO-Seq transcripts. To identify transcription units 
in an unbiased manner, GRO-Seq read densities were analysed to classify genomic 
regions into contiguous transcripts using HOMER. GRO-Seq read densities were 
initially normalized using the GC content of individual reads to remove any 
systematic bias introduced by overall GC content variation between read libraries. 
To maximize read depth for transcript identification, all GRO-Seq libraries were 
merged to perform the initial transcript discovery, and later considered separately 
to identify regulated transcripts. For each strand of each chromosome, GRO-Seq 
read densities were calculated using a sliding window of 250 bp. Regions for which 
GRO-Seq reads could not be uniquely mapped (that is, repeats) were first iden- 
tified and then read densities in these regions were estimated using upstream read 
densities from mappable regions to avoid ending predicted transcripts prema- 
turely. Transcript initiation sites were identified as regions where the GRO-Seq 
read density increased threefold relative to the previous 1 kb region. Transcript 
termination sites were defined by either a reduction in reads below 10% of the start 
of the transcript or when another transcript’s start site occurred on the same 
strand. Single spikes in read density covering a span less than 250 bp were con- 
sidered artefacts and discarded. Identified transcripts were strand-specifically 
compared with RefSeq transcripts by looking for overlap in the transcribed region. 
Transcripts were defined as putative eRNAs if their TSS was located distal to
RefSeq TSS (>3 kb) and were associated with H3K4me1 regions. To identify
differentially regulated transcripts, strand-specific read counts from each
GRO-Seq experiment were determined for each transcript using HOMER. EdgeR
(http://www.bioconductor.org/packages/release/bioc/html/edgeR.html) was then
used to calculate differential expression (>1.5-fold, <0.01 false discovery rate).
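
The initiation and termination rules described above can be sketched in a few lines. This is a simplified illustration only and is not HOMER's implementation: the function name, the per-base density list and the non-overlapping window stepping are assumptions, and the same-strand restart rule, repeat handling and spike filtering are left out.

```python
from typing import List, Tuple


def call_transcripts(density: List[float],
                     window: int = 250,
                     upstream: int = 1000,
                     fold: float = 3.0,
                     end_fraction: float = 0.10) -> List[Tuple[int, int]]:
    """Return (start, end) intervals for one strand of one chromosome.

    density[i] is a hypothetical read density at base i.
    """
    def mean(lo: int, hi: int) -> float:
        lo, hi = max(lo, 0), min(hi, len(density))
        return sum(density[lo:hi]) / (hi - lo) if hi > lo else 0.0

    transcripts: List[Tuple[int, int]] = []
    start = None
    start_level = 0.0
    for pos in range(0, len(density) - window + 1, window):
        current = mean(pos, pos + window)
        previous = mean(pos - upstream, pos)
        if start is None:
            # Initiation: density is at least `fold` times the previous 1 kb.
            if current > 0 and current >= fold * previous:
                start, start_level = pos, current
        elif current < end_fraction * start_level:
            # Termination: density falls below 10% of the level at the start.
            transcripts.append((start, pos))
            start = None
    if start is not None:
        transcripts.append((start, len(density)))
    return transcripts
```

With a toy profile that is silent for 1 kb, high for 1 kb and then nearly silent, the sketch reports a single transcript covering the high-density region.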

Microarray and reverse-transcription qPCR analyses of gene expression. Total
RNA was isolated with the RNeasy Mini Kit (Qiagen) and treated by RNase-free 
DNase I. For PCR with reverse transcription, first-strand cDNA synthesis from total 
RNA was performed with the Superscript III cDNA Synthesis System (Invitrogen). 
Microarray analysis was performed on Human V2 Chips (Illumina). The published 
gene expression profiling data GDS2545 (refs 36, 37) and GDS1439 (ref. 38) were 
extracted from the National Center for Biotechnology Information, normalized and 
P values calculated by two-tailed t-test. For validation by PCR with reverse
transcription, cDNA was analysed with SYBR Green (Stratagene) on the Mx3000P
System (Stratagene). The relative messenger RNA level was calculated by comparing 
with non-treatment control, after normalization with GAPDH or ACTB messenger 
RNA. The primers for RT-qPCR (5′-3′) were as follows:
ACTB-5, CGTCCCAGTTGGTGACGATG; ACTB-3, GCCGTCTTCCCCTCCATC;
GAPDH-5, GTTTTTCTAGACGGCAGGTCAGG; GAPDH-3, AACATCATCCCTGCCTCTACTGG;
KLK3-5, TGTGTGCGCAAGTTCACCG; KLK3-3, GGTTCACTGCCCCATGAC;
RASSF3-5, GACGCCGAGGACTTCTTCTT; RASSF3-3, TGCTGAGGTAACTGTGGGTTT;
SOX9-5, GACTCGCCACACTCCTCCTC; SOX9-3, AAGTCGATAGGGGGCTGTCT;
IL6R-5, GAGATTCTGCAAATGCGACA; IL6R-3, GTTGGGGAGATGAGAGGAACA;
DNM2-5, TGTTTGCCAACAGTGACCTC; DNM2-3, CCCAGACCACTGAAGCTCCT.
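
The relative mRNA calculation described above can be illustrated as follows. The text does not state which formula was used, so this sketch assumes the common 2^-ΔΔCt approach with a single reference gene (GAPDH or ACTB); the function and argument names are hypothetical.

```python
def relative_mrna_level(ct_target_treated: float,
                        ct_ref_treated: float,
                        ct_target_control: float,
                        ct_ref_control: float) -> float:
    """Fold change of a target gene versus the non-treatment control.

    Each Ct is first normalized to the reference gene (delta Ct), then the
    treated sample is compared with the control (delta-delta Ct).
    Assumes roughly 100% amplification efficiency (a doubling per cycle).
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_treated - delta_control)


# Hypothetical Ct values: the target crosses threshold two cycles earlier
# in the treated sample, with the reference gene unchanged.
fold_change = relative_mrna_level(22.0, 18.0, 24.0, 18.0)
```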

Survival analysis. Two independent sets of gene expression data were used to
check the association between FoxA1 and clinical outcome of patients by
Kaplan-Meier analysis. One data set came from 78 patients with prostate tumours
(age <70), the other from 131 patients. Significant association with outcome was
determined by log-rank test for survival. Hazard ratios were calculated by the Cox
proportional hazards model. All statistics were analysed with the statistical
software R (version 2.6.2), available from the R Project for Statistical
Computing website (http://www.r-project.org). The cut-off was chosen as the one
giving the smallest log-rank P value among candidate cut-offs spanning the
5th-95th percentiles of the signals.
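
The cut-off scan described above can be sketched as below. The original analysis was done in R; this Python version is an illustrative assumption, with hypothetical function names and a simple index-based percentile rule. For a two-group log-rank test with one degree of freedom, the largest chi-square statistic corresponds to the smallest P value, so the sketch maximizes chi-square rather than computing P.

```python
from typing import List, Optional, Sequence, Tuple


def logrank_chi2(times: Sequence[float], events: Sequence[int],
                 groups: Sequence[bool]) -> float:
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    observed = expected = variance = 0.0
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        at_risk = [g for ti, g in zip(times, groups) if ti >= t]
        n, n1 = len(at_risk), sum(at_risk)
        d = sum(1 for ti, e in zip(times, events) if e and ti == t)
        d1 = sum(1 for ti, e, g in zip(times, events, groups)
                 if e and ti == t and g)
        observed += d1
        expected += d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return (observed - expected) ** 2 / variance if variance > 0 else 0.0


def best_cutoff(signal: Sequence[float], times: Sequence[float],
                events: Sequence[int], lo_pct: float = 5,
                hi_pct: float = 95) -> Optional[Tuple[float, float]]:
    """Scan cut-offs between the 5th and 95th percentiles of the signal."""
    ranked: List[float] = sorted(signal)
    lo_idx = int(len(ranked) * lo_pct / 100)
    hi_idx = min(int(len(ranked) * hi_pct / 100), len(ranked) - 1)
    best = None
    for cut in sorted(set(ranked[lo_idx:hi_idx + 1])):
        groups = [s > cut for s in signal]  # True = high-expression group
        chi2 = logrank_chi2(times, events, groups)
        if best is None or chi2 > best[1]:
            best = (cut, chi2)
    return best
```

Because the cut-off is selected to optimize the test statistic, the resulting P value is optimistic; that caveat applies to any minimum-P scan of this kind.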
Converting nonsense codons into sense codons by
targeted pseudouridylation 


John Karijolich & Yi-Tao Yu


All three translation termination codons, or nonsense codons, contain
a uridine residue at the first position of the codon. Here, we demonstrate
that pseudouridylation (conversion of uridine into pseudouridine
(Ψ), ref. 4) of nonsense codons suppresses translation
termination both in vitro and in vivo. In vivo targeting of nonsense
codons is accomplished by the expression of an H/ACA RNA capable
of directing the isomerization of uridine to Ψ within the nonsense
codon. Thus, targeted pseudouridylation represents a novel approach
for promoting nonsense suppression in vivo. Remarkably, we also
show that pseudouridylated nonsense codons code for amino acids
with similar properties. Specifically, ΨAA and ΨAG code for serine
and threonine, whereas ΨGA codes for tyrosine and phenylalanine,
thus suggesting a new mode of decoding. Our results also suggest that
RNA modification, as a naturally occurring mechanism, may offer a
new way to expand the genetic code.

Ψ, the C5-glycoside isomer of uridine, has many structural and
biochemical differences from uridine (Supplementary Fig. 1). Thus,
it is possible that replacement of the uridine within a nonsense codon
with Ψ may affect translation termination. To investigate the possible
effect of Ψ on translation termination, we developed an in vitro nonsense
suppression assay (Fig. 1a). Briefly, we synthesized an artificial
messenger RNA that encoded a 6× histidine (6His) tag at the amino
terminus and a Flag tag at the carboxy terminus. In between the 6His
tag and Flag tag, a pseudouridylated nonsense codon (ΨAA) was
inserted (Fig. 1a). In addition, two control transcripts were created
with the same sequence except that the Ψ of the nonsense codon
was either changed to uridine (U), thus forming an authentic nonsense
codon (UAA), or substituted with a cytidine (C), thus eliminating the
nonsense codon (CAA) (Fig. 1a). Anti-6His immunoblot analysis indicated
that all three RNAs were equally translated in rabbit reticulocyte
lysate (Fig. 1b, top panel). Remarkably, however, according to the
anti-Flag blot, the presence of a Ψ within the termination codon resulted in
robust nonsense suppression (Fig. 1b, lower panel). Specifically, the
ΨAA-containing transcript produced a strong Flag signal which is
almost comparable to that produced by the CAA-containing transcript
(Fig. 1b). In contrast, only a background level of Flag was produced
when the UAA-containing transcript was used (Fig. 1b). Our results
thus indicate that presence of Ψ in a termination codon effectively
suppresses translation termination in vitro.

The in vitro results prompted us to investigate whether the pseudouridylation
of a termination codon would elicit nonsense suppression
in vivo. Taking advantage of the CUP1 reporter system, we
introduced a premature termination codon (PTC) at the second codon
of the CUP1 gene, thus creating a new CUP1 reporter gene (termed
cup1-PTC). Cup1p is a copper chelating protein that mediates resistance
to copper sulphate (CuSO4). Thus, upon transformation of the
cup1-PTC plasmid (pcup1-PTC) into a Saccharomyces cerevisiae cup1-Δ
strain, one should be able to measure nonsense suppression by
plating the cells on selective medium containing CuSO4 (Fig. 2A).

To direct site-specific Ψ formation in vivo, we took advantage of the
H/ACA ribonucleoprotein (RNP) family. H/ACA RNPs are primarily
responsible for the post-transcriptional isomerization of uridine to Ψ
within RNA (Supplementary Fig. 2). To target the PTC within cup1-PTC
we derived an H/ACA RNA from SNR81, a naturally occurring
yeast H/ACA RNA. The newly derived H/ACA RNA, snR81-1C, contained
a guide sequence capable of targeting the PTC within cup1-PTC.
In addition, we also constructed a control H/ACA RNA, snR81-Random,
which contained a random guide sequence.

To ensure that cup1-PTC was pseudouridylated in response to expression
of snR81-1C, we measured Ψ formation within the PTC both in vitro
and in vivo. To analyse Ψ formation in vitro, we monitored, by thin layer
chromatography (TLC), Ψ formation on a 39-nucleotide fragment of
RNA corresponding to the region of cup1-PTC containing the PTC
(Fig. 2B). Incubation of the transcript in extracts prepared from cells
expressing snR81-1C resulted in the formation of Ψ (Fig. 2B, lane 5),
whereas extracts containing an empty vector or snR81-Random did not
result in the formation of Ψ (lanes 3 and 4). These results indicate that
snR81-1C is capable of directing pseudouridylation within the RNA
fragment, most probably at the target site, that is, the uridine of the PTC.

To determine the pseudouridylation status of cup1-PTC in vivo, we 
analysed the PTC of cellularly derived cup1-PTC mRNA using a site- 
specific and quantitative pseudouridylation assay, namely site-specific 
cleavage and radiolabelling followed by nuclease digestion and 
Figure 1 | Pseudouridylation of a termination codon promotes nonsense
suppression in vitro. a, Nucleotide sequence of the in vitro transcription
product and its translated sequence are shown. Positions of the Kozak
sequence, as well as epitopes (6His and Flag) within the nucleotide and protein
sequences are labelled. The pseudouridylated nonsense codon is indicated.
Changes of Ψ to U and Ψ to C are also indicated. b, Anti-6His and anti-Flag
immunoblot analysis of the in vitro translation lysate following translation of an
RNA lacking a termination codon (CAA), an RNA containing a pseudouridylated
termination codon (ΨAA), or an RNA containing an authentic termination
codon (UAA). Relative efficiency of read-through (anti-Flag/anti-6His) was
calculated and indicated in parentheses (the control, CAA, is set to 100%). Error is
given as the standard deviation of three independent experiments.
Figure 2 | Quantification of cup1-PTC pseudouridylation. A, Schematic of
the in vivo nonsense suppression assay. B, In vitro pseudouridylation assay by
thin layer chromatography. 5′ ³²P-radiolabelled uridylate (pU) and
pseudouridylate (pΨ) markers were run in parallel. The substrate, a [α-³²P]UTP
uniformly labelled RNA fragment, is shown. C, Quantification of cup1-PTC
pseudouridylation in vivo. The percentage of pseudouridylation was calculated
(pΨ/(pΨ + pU)). a, Schematic; b, Spike; c, snR81-Random; d, snR81-1C.
Adenosine 5′-monophosphate (pA), cytidine 5′-monophosphate (pC),
guanosine 5′-monophosphate (pG), uridine 5′-monophosphate (pU), and
pseudouridine 5′-monophosphate (pΨ), are indicated. ¢, origin. Error is given
as the standard deviation of three independent experiments.


two-dimensional TLC (2D-TLC). To help locate the uridine and Ψ
spots on the TLC plate, we spiked each reaction with ³²P-radiolabelled
adenosine 5′-monophosphate (pA), cytidine 5′-monophosphate (pC)
and guanosine 5′-monophosphate (pG) (Fig. 2C, panel b). Consistent
with the results of our in vitro analysis (Fig. 2B), only CUP1 mRNA
isolated from cells expressing snR81-1C produced a spot corresponding
to Ψ (Fig. 2C, compare panel c with panel d). Quantification
indicated that approximately 5% of the cup1-PTC transcript was
pseudouridylated. Thus, our in vivo pseudouridylation results would predict
that, upon expression of snR81-1C, a functional Cup1p (although
in small amount) would be translated from the cup1-PTC mRNA.
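
The percentage of pseudouridylation quoted above follows the ratio given in the Figure 2 legend, pΨ/(pΨ + pU). A minimal sketch of that arithmetic is shown here; the function name and the spot intensities are hypothetical, and the use of the sample standard deviation for the three replicates is an assumption.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def percent_pseudouridylation(psi_signal: float, u_signal: float) -> float:
    """Percentage of the target uridine converted to pseudouridine.

    psi_signal and u_signal are the quantified intensities of the
    pseudouridylate and uridylate spots on the TLC plate.
    """
    total = psi_signal + u_signal
    if total <= 0:
        raise ValueError("spot intensities must sum to a positive value")
    return 100.0 * psi_signal / total


def summarize(replicates: Sequence[Tuple[float, float]]) -> Tuple[float, float]:
    """Mean and standard deviation across independent experiments."""
    values = [percent_pseudouridylation(psi, u) for psi, u in replicates]
    return mean(values), stdev(values)


# Hypothetical intensities (pseudouridylate, uridylate) from three experiments.
average, spread = summarize([(5.0, 95.0), (4.0, 96.0), (6.0, 94.0)])
```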

Figure 3a shows the results of the in vivo nonsense suppression assay
(see Fig. 2A for illustration). As expected, when transformed with wild-type
pCUP1, cup1-Δ cells grew healthily on media containing 0 mM or
0.02 mM CuSO4 (top row). However, when transformed with pcup1-PTC,
only cells co-transformed with psnR81-1C were able to survive
on medium containing 0.02 mM CuSO4 (compare the middle row
with the bottom row). Partial restoration of growth seems to be consistent
with the previous quantitative analysis demonstrating a low
level of pseudouridylation (~5%) at the PTC (Fig. 2C).

Because nonsense suppression could be achieved both at the level of
translation (termination suppression) and the level of mRNA decay
where PTC-containing mRNA is usually the target of NMD (nonsense-mediated
decay), it remained possible that the suppression we
observed in Fig. 3a was a result of NMD suppression rather than
suppression of translation termination. To test this possibility, we
measured the levels of CUP1 mRNA using northern blot analysis
(Fig. 3b). As expected, steady-state levels of cup1-PTC mRNA dropped
significantly when compared with the level of wild-type CUP1 mRNA
(compare lanes 6 and 7 with lane 5). However, expression of the PTC-specific
guide RNA (snR81-1C) had no effect on steady-state levels of
cup1-PTC mRNA; nearly identical levels of cup1-PTC mRNA were
detected in cells transformed with either psnR81-1C or with empty
vector (compare lane 6 with lane 7). These results indicated that the
observed suppression was a result of nonsense codon suppression
rather than a result of suppressing NMD. To completely eliminate
the potential complications of NMD, we deleted UPF1 (also known
as NAM7), a gene required for NMD, and then repeated the nonsense
suppression assay and northern blotting. As expected, deletion of
UPF1 resulted in the stabilization of cup1-PTC (Fig. 3d). Consistent
with previously published results, we also observed a small degree
of nonsense suppression as evidenced by a low (but above background)
level of growth on medium containing a low concentration (0.013 mM)
of CuSO4 (Supplementary Fig. 3, compare row 5 with row 2). However,
no growth was observed when CuSO4 concentration was raised to
0.02 mM (Fig. 3c, middle row). Apparently, the nonsense suppression
phenotype conferred by deletion of UPF1 is not sufficient to promote
growth on 0.02 mM CuSO4. To assess nonsense suppression induced by
PTC pseudouridylation, we plated cells on media containing either
0.02 mM or 0.013 mM CuSO4 (Fig. 3c and Supplementary Fig. 3).
Under both conditions, expression of snR81-1C provided a growth
advantage; the level of growth rescue is similar to that observed when
UPF1 was intact (compare Fig. 3c with Fig. 3a, and Supplementary Fig.
3). These results further support the notion that expression of snR81-1C
or pseudouridylation of the PTC resulted in suppression of translation
termination rather than suppression of NMD. Given that the control,
where snR81-Random was similarly expressed (Fig. 3d), showed no
suppression (Fig. 3c), our results also indicate that the observed
suppression of translation termination is guide-RNA-specific.

To determine further whether Ψ-mediated nonsense suppression
can be generalized as well as which amino acids are incorporated at
Ψ-substituted nonsense codons, we took advantage of a plasmid containing
a C-terminally tagged TRM4 gene (also known as NCL1),
pTRM4-WT (Fig. 4a). Through site-directed mutagenesis the codon
for phenylalanine at position 602 (F602) of the TRM4 gene was changed
to a nonsense codon (TAA, TAG or TGA), creating three variants of the
plasmid (pTRM4-F602X(TAA), pTRM4-F602X(TAG), and
pTRM4-F602X(TGA)) (Fig. 4a).

Figure 4b shows the western blot analysis of extracts prepared
from wild-type cells expressing wild-type TRM4 (nonsense-free) or
TRM4-F602X(TAA). When cells were transformed with the wild-type
TRM4 plasmid, a strong protein A-tag signal was detected (Fig. 4b,
lane 4). However, when cells were transformed with
pTRM4-F602X(TAA), a protein A signal was only detected when
co-transformed with a TRM4-F602X(TAA)-specific guide RNA, indicating
nonsense suppression (Fig. 4b, compare lane 5 with lane 6).

Figure 3 | Expression of an H/ACA RNA

Next, we carried out large-scale immunoprecipitations to purify full-length
Trm4p produced as a consequence of Ψ-mediated nonsense
suppression. Figure 4c shows an example [pTRM4-F602X(TAA)] of
such experiments. The bands corresponding to full-length Trm4p
produced as a consequence of Ψ-mediated nonsense suppression
were excised and sequenced by liquid chromatography-tandem mass
spectrometry (LC-MS/MS; Fig. 4c and Supplementary Figs 4-7).
Remarkably, LC-MS/MS analysis indicated that pseudouridylated
UAA and UAG (ΨAA and ΨAG) both directed the incorporation of
either serine or threonine (Fig. 4d and Supplementary Figs 4-6).
Taking into account that the third base of a codon is usually nonspecific
(the wobble base), it makes sense that both ΨAA and ΨAG
code for the same amino acids. With respect to targeted pseudouridylation
of UGA (ΨGA), it directed the incorporation of tyrosine and
phenylalanine (Fig. 4d and Supplementary Fig. 7). As all three termination
codons directed the incorporation of two amino acids, we quantified
their frequency of incorporation (Supplementary Fig. 8).
Interestingly, although ΨAA and ΨAG both code for serine and threonine,
serine is predominantly incorporated at ΨAG, whereas serine
and threonine are incorporated at a roughly similar frequency for
ΨAA. Furthermore, ΨGA primarily directs the incorporation of
tyrosine.
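
The decoding outcomes reported above can be summarized as a small lookup table. The sketch below is only a restatement of those results in code form; the dictionary and function names are hypothetical, and the incorporation frequencies (Supplementary Fig. 8) are deliberately not encoded because they are not given numerically in the text.

```python
from typing import Dict, Tuple

# Unmodified termination codons.
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Amino acids reported at pseudouridylated termination codons.
PSI_CODON_DECODING: Dict[str, Tuple[str, ...]] = {
    "ΨAA": ("Ser", "Thr"),
    "ΨAG": ("Ser", "Thr"),
    "ΨGA": ("Tyr", "Phe"),
}


def decode_termination_codon(codon: str) -> Tuple[str, ...]:
    """Return the reported outcome for a (possibly modified) stop codon."""
    if codon in STOP_CODONS:
        return ("stop",)
    if codon in PSI_CODON_DECODING:
        return PSI_CODON_DECODING[codon]
    raise ValueError(f"{codon!r} is not a termination codon or its Ψ form")


def pseudouridylate(codon: str) -> str:
    """Replace the first-position uridine of a stop codon with Ψ."""
    if codon not in STOP_CODONS:
        raise ValueError(f"{codon!r} is not a termination codon")
    return "Ψ" + codon[1:]
```

For example, `decode_termination_codon(pseudouridylate("UGA"))` gives the tyrosine and phenylalanine pair, while the unmodified codon still reads as a stop.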
Figure 4 | Generalization of Ψ-mediated nonsense suppression and
determination of amino acids coded for by pseudouridylated nonsense
codons. a, Schematic representation of the constructs used for protein
purification (also see text). b, Western blot analysis was carried out using
extracts prepared from wild-type cells transformed with either pTRM4 wild
type (WT) and a plasmid containing a random guide RNA gene (pRandom
guide RNA) (lane 4), pTRM4-F602X(TAA) and pRandom guide RNA (lane 5),
or pTRM4-F602X(TAA) and a plasmid containing a guide RNA gene that
targets the nonsense codon (UAA 602) of TRM4-F602X(TAA) (lane 6).
Enolase (Eno1) was probed as a loading control. The normalized levels of
Trm4p are indicated in parentheses under each lane. Error is represented as the
standard deviation from three independent experiments. c, Cell cultures
described in b were scaled up, and Trm4 proteins were purified and analysed on
an SDS-PAGE gel (stained with Coomassie blue); lanes correspond to those in
b. In the control lane (Con), a known amount (6 µg) of purified Trm4p was
loaded. M, molecular weight marker. d, Identification of amino acids
incorporated at Ψ-containing termination codons (also see Supplementary
Figs 4-8).
Interestingly, however, given that the anticodons of the transfer
RNA(Ser) and tRNA(Thr) families do not look alike (Supplementary Fig.
9), our experimental data raise an important question: how is the same
pseudouridylated nonsense codon (for example, ΨAA or ΨAG) recognized
by the different families of tRNA? Although it is possible that the
presence of Ψ in mRNA-tRNA duplexes acts to stabilize interactions
between the mRNA and near- or non-cognate tRNAs, an alternative
explanation is that the presence of Ψ within the A-site may disorder the
local RNA (or ribosome) structure, somehow allowing for the binding
or accommodation of near- or non-cognate tRNAs, possibly through
altering the hydration state of the nonsense codon. It is also possible
that unique RNA modifications in the anticodon loop of tRNA(Ser),
tRNA(Thr), tRNA(Phe) or tRNA(Tyr) contribute to the recognition of
pseudouridylated nonsense codons, thus allowing them to be decoded. In
fact, modifications within the anticodon loop of tRNA have previously
been demonstrated to impact recoding. Perhaps more interestingly,
it has not escaped our attention that the amino acids incorporated at
each termination codon are biochemically and structurally similar.
Specifically, serine and threonine, which are coded for by ΨAA and
ΨAG, are both hydroxylated short-chain amino acids. Likewise, tyrosine
and phenylalanine, which are coded for by ΨGA, both contain an
aromatic ring. Although the decoding centre is ~75 Å away from the
peptidyl transferase centre, whether there is a role for the amino acid
in the decoding of Ψ-containing termination codons is an interesting
idea that requires further analysis. If true, such a mechanism would
represent a completely new mode of decoding. It is interesting to note
that frameshifting at sense codons also shows a strong preference for
using tRNA(Ser) and tRNA(Thr) (ref. 19). Although detailed mechanisms
are still unclear, the similarities in using similar polar amino acids
(serine and threonine) in frameshifting and in decoding of
pseudouridylated nonsense codons certainly deserve further attention.

Our data demonstrate that artificial H/ACA guide RNAs are able to 
direct the pseudouridylation of nonsense codons of mRNA, thus lead- 
ing to nonsense suppression. It should be noted that artificial guide 
RNAs may have an unintended target(s), thus raising concerns about 
substrate specificity. We were, however, mindful of this concern when
designing sense-to-nonsense mutations and their corresponding guide 
RNAs, and purposely avoided the sites and their guide sequences that 
could target other endogenous mRNAs. In fact, BLAST search against 
the yeast genome did not generate any other potential targets that 
appear to be suitable substrates for our artificial H/ACA RNAs. 
Thus, it is unlikely that the observed effects are due to the nonspecific 
effect of modifications of unintended off-targets. 

Our RNA-guided modification strategy is of significant clinical interest,
given the current estimates that approximately 33% of genetic diseases
can be attributed to the presence of a PTC. On the other hand,
because the artificial guide RNAs are derived from naturally occurring 
H/ACA RNAs (only the short guide sequence is changed), we predict 
that the nonsense codons of some mRNAs are naturally pseudouridy- 
lated by endogenous H/ACA RNAs as long as the guide sequence 
matches the target. Indeed, using computational algorithms to predict 
nonsense codons that may be natural targets of the endogenous H/ACA 
RNP machinery yields a number of potential candidates (Supplemen- 
tary Fig. 10). In addition, our lab has recently demonstrated that an 
exact match between the H/ACA RNA guide sequence and the target 
sequence is not necessary for efficient modification under certain conditions.
In fact, the mismatches are required for inducible pseudouridylation
in response to cell stress. Thus, there are probably a large
number of pseudouridylation targets in mRNAs. Whereas some of these 
potential targets are nonsense codons, a majority of them are expected to 
be sense codons. Given our surprising discovery that pseudouridylation 
of nonsense codons converts them into sense codons, it is not impossible
that pseudouridylation of sense codons will alter their decoding,
making mRNA pseudouridylation a novel mechanism of RNA editing.
If this is true, the genetic code would expand considerably. We predict
that targeted pseudouridylation of mRNA is a yet-to-be appreciated
mechanism of generating protein diversity.


METHODS SUMMARY 


Relevant properties of strains and growth conditions are described in the text. 
Strain construction and additional growth conditions are described in the 
Methods. Standard procedures were used for all protein and RNA analyses and 
are described in the Methods. Mass spectrometry was performed at the University 
of Rochester Medical Center Proteomics Core and is described in the Methods. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS

Yeast strains, transformation and growth assay. The cup1-Δ yeast strain was
kindly provided by C. Guthrie. The UPF1 locus was deleted from a cup1-Δ strain
using a standard protocol as described previously. For the analysis of CuSO4
resistance the appropriate plasmids were transformed into either cup1-Δ or
cup1-Δ upf1-Δ yeast strains as previously described, except that after heat shock
cells were precipitated and resuspended in 100 µl of water rather than YPD (yeast
peptone dextrose). Single colonies were selected and grown to saturation in
SGal (synthetic galactose) drop-out media, cells were diluted to an
OD600 = 0.001 and then a series of fivefold dilutions were spotted on to SGal
drop-out media, with or without CuSO4. Growth phenotypes were assessed after
cells were grown for 3-5 days at 30 °C.
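
The spotting series described above is a simple geometric dilution. The sketch below computes the nominal optical density of each spot; the function name is hypothetical and the number of spots is an assumption, because the text gives only the starting density and the dilution factor.

```python
from typing import List


def dilution_series(start_od: float = 0.001,
                    factor: float = 5.0,
                    spots: int = 4) -> List[float]:
    """Nominal OD600 of each spot in a serial dilution.

    The first spot is the starting culture; each later spot is diluted
    by `factor` relative to the one before it.
    """
    if factor <= 1 or spots < 1:
        raise ValueError("factor must exceed 1 and spots must be positive")
    return [start_od / factor ** i for i in range(spots)]


# Starting at OD600 = 0.001 with fivefold steps (spot count assumed).
series = dilution_series()
```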

Plasmids. The pCUP1 plasmid was a gift from D. McPheeters, and pTRM4 WT
was a gift from E. Phizicky and B. Grayhack. pcup1-PTC and pTRM4 F602X
variants were created by site-directed mutagenesis using Pfu polymerase
(Stratagene) and the appropriate oligonucleotides and plasmids. Novel H/ACA
RNA genes were constructed by PCR using four overlapping oligonucleotide
primers and were either cloned into 2 µm URA3 or 2 µm LEU2 vector (both gifts
from E. Phizicky) as BamHI/HindIII fragments.

In vitro transcription and translation. To generate mRNA transcripts for in
vitro translation, DNA templates were synthesized through PCR using two
overlapping DNA oligonucleotides. The double-stranded DNA templates thus
synthesized contained either a TAA nonsense codon or a CAA codon in the middle,
flanked by a 6His-coding sequence near the 5′ end and a Flag-coding sequence at
the 3′ end (Fig. 1b). For efficient in vitro translation, the templates also contained a
Kozak sequence immediately upstream of the 6His coding sequence (Fig. 1b). In
addition, a T7 promoter sequence was included at the 5′ terminus. Following in
vitro T7 transcription, UAA- or CAA-containing mRNA transcripts were
synthesized (see Fig. 1b). To create a similar mRNA, with the uridine of the
nonsense codon (or the cytidine of the CAA codon) changed to Ψ, a two-piece
splint ligation was employed. The 5′ RNA was in-vitro-synthesized through T7
transcription, ending with CUC at its 3′ terminus (see Fig. 1b), and the 3′ piece
(5′-GAGΨAAGACUACAAGGACGACGACGACAAGAUCUAG-3′) (see Fig. 1a)
was chemically synthesized (Thermo Scientific). The 5′ and 3′ halves were ligated
using a bridging oligonucleotide and T4 DNA ligase. In-vitro-synthesized RNAs
were gel-purified before being used in the in vitro translation reactions. In vitro
translation reactions were carried out in 30 µl reactions of Red Nova Lysate as
described by the supplier (Novagen). PCR of two overlapping
oligodeoxynucleotides was also used to generate the template for in vitro
transcription of the substrate used in the in vitro pseudouridylation assay (Fig. 2C).

Northern blot analysis. Total RNA was isolated from yeast using TRIzol
essentially as described by the supplier (Invitrogen), except that cells were vortexed with
acid-sterilized glass beads for 5 min. For northern blot analysis, 6 µg of total RNA
was separated on 8% polyacrylamide-7 M urea gels and electrotransferred at 4 °C
to Amersham Hybond-N+ membranes in 0.5× TBE buffer for 16 h at 15 V.
Hybridizations were performed essentially as described.

Protein purification and immunoblot analysis. For the analysis of Trm4p protein
sequence, BY4741 was transformed with the appropriate plasmids as described
before. Yeast whole-cell extract preparation and IgG Sepharose chromatography
were performed as previously described. For analysis of in vitro translation products,
membranes were probed with either a monoclonal Flag antibody (M2;
Sigma-Aldrich) or monoclonal His-probe (H-3; Santa Cruz Biotechnology). Goat
anti-mouse IgG (H+L)-alkaline phosphatase (AP) conjugate (Bio-Rad) was then
used as a secondary antibody. Proteins were visualized using 1-Step NBT/BCIP
(Pierce). For the analysis of Trm4p, yeast crude extracts were separated on 4-15%
Tris-HCl Ready gels (Bio-Rad). Proteins were then transferred to 0.1 µm
nitrocellulose membranes (Whatman) and probed with either monoclonal Protein A
(Sigma-Aldrich) or anti-Eno1p (a gift from M. Dumont). Goat anti-mouse IgG
(H+L)-AP conjugate (Bio-Rad) was used as a secondary antibody. Proteins were
visualized using 1-Step NBT/BCIP (Pierce).

Pseudouridylation assays. In vitro pseudouridylation assays were performed
using yeast whole-cell extract. Cells were grown to mid log phase and pelleted.
Pellets were resuspended in 200 µl of extraction buffer containing 20 mM HEPES
at pH 7.9, 0.42 M NaCl, 1.5 mM MgCl2, 0.2 mM EDTA, 0.5 mM DTT, 0.5 mM
PMSF, and 25% glycerol. Sterile acid-washed glass beads (400 µl) were added to
the cell suspension, and cells were subsequently homogenized through vigorous
vortexing (5 × 30 s) at 4 °C. Following a 5-min centrifugation (14,000g, 4 °C), the
supernatant was recovered, and used for the pseudouridylation assay. The
substrate RNA was prepared by in vitro transcription in the presence of [α-³²P]UTP.
The substrate was gel purified and incubated in the extracts for 2 h. The
radiolabelled substrate was recovered and digested with nuclease P1 and analysed by
one-dimensional TLC as previously described. Pseudouridylation of cellularly
derived cup1-PTC RNA was analysed as previously described, except that
modifications were analysed by 2D-TLC.

Mass spectrometry. Mass spectrometry was performed at the University of
Rochester Proteomics Center. Coomassie-stained gel bands corresponding to
full-length Trm4p were subjected to in-gel trypsin digestion. An 80-min
LC-MS/MS run was performed in-line with a Finnigan LTQ Ion Trap mass
spectrometer (Thermo Scientific), using a flow rate of 250 µl min⁻¹. The data collected
from the LTQ runs were searched using MASCOT (Matrix Science), initially
against the full Saccharomyces database, second against a custom database which
included the wild type Trm4p sequence, as well as a Trm4p sequence with an “X”
in the amino acid position that corresponds to the stop codon. Peptides identified
by Mascot with an ion score of 15 or greater were inspected further for MS/MS
fragmentation patterns that map through most of the peptide sequence, especially
on and through the mutant amino acid position. Peptides with Expect values
greater than 0.05 were not accepted. To allow for relative quantification, we
repeated the LC-MS/MS experiments using dynamic inclusion for only the
peptides of interest. Total spectral counts obtained by dynamic inclusion therefore
represent the relative abundances of each respective peptide.
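
The acceptance thresholds and spectral-count comparison described above can be sketched as follows. This is not Mascot's API: the record type, field names and peptide labels are hypothetical, and the manual inspection of fragmentation patterns is not modelled.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class PeptideMatch:
    sequence: str      # peptide identified by the search engine
    ion_score: float   # Mascot ion score
    expect: float      # Mascot Expect value


def is_accepted(match: PeptideMatch,
                min_ion_score: float = 15.0,
                max_expect: float = 0.05) -> bool:
    """Ion score of 15 or greater, and Expect value not above 0.05."""
    return match.ion_score >= min_ion_score and match.expect <= max_expect


def relative_abundance(matches: Iterable[PeptideMatch]) -> Dict[str, float]:
    """Fraction of accepted spectra assigned to each peptide sequence."""
    counts = Counter(m.sequence for m in matches if is_accepted(m))
    total = sum(counts.values())
    return {seq: n / total for seq, n in counts.items()} if total else {}
```

Applied to the peptides spanning the mutant position, the resulting fractions correspond to the relative incorporation of each amino acid, under the assumption stated in the text that spectral counts track abundance.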
Close quarters


Cramped living conditions, unruly colleagues or crowded 
schedules can be challenging for the most intrepid scientist. 


BY LUCAS LAURSEN 


In the scientists’ lounge aboard the BIO Hespérides one evening last March,
Jordi Dachs points at the schedule for the next day’s oceanographic
observations. The Spanish research vessel is chugging across the Indian
Ocean at a speed of about ten knots. “The storm has put us seven hours
behind,” warns Dachs, an environmental chemist at the Institute of
Environmental Assessment and Water Research in Barcelona, Spain, whose
responsibilities as chief scientist on the ship include planning
researchers’ time and instrument use. About two dozen scientists brace
themselves against the rhythmic pitching of the vessel. “We might not
lower the sampling rosette all the way on some days,” says Dachs, “to
save time.”

His suggestion fills the room with tension. 
Lowering and retrieving the rosette can take 
many hours, but the water samples it retrieves 
from the ocean's depths — as much as 4 kilo- 
metres down — hold the biggest potential for 
new discoveries. Sampling excursions in the 
Indian Ocean's deep waters are relatively rare, 
making the samples particularly valuable. 
“Isn't this a deep-sea campaign?” snaps
Celia Marrasé, a plankton ecologist at the 
Institute of Marine Sciences in Barcelona. 
Marrasé studies how dissolved and par- 
ticulate organic matter sequesters carbon in 
the sea: skipping the deepest waters would 
affect her results. After more heated discus- 
sion, Dachs agrees to consider saving a few 
minutes on each of the 12 remaining days at 
sea by cutting time for all the observations, 
not just those from the rosette. The disgrun- 
tled researchers scatter, but cannot go very 
far: their bunk rooms open directly onto the 
scientists’ lounge. 

In such a tight space, there is neither the 
room, the time nor the money to let differ- 
ences escalate. “It’s very expensive to send 
people to these places and you want to get a 
lot of return,” says Albert Harrison, a retired
psychologist from the University of Califor- 
nia, Davis, who has studied the social factors 
that lead to success on space missions. For 
decades, NASA, the Russian Federal Space 
Agency and researchers in psychology and 
anthropology have examined how to achieve 
productivity in remote settings, gleaning les- 
sons that are useful for any scientist who con- 
ducts fieldwork in close quarters. 

Astronauts, polar biologists, desert geolo- 
gists and ocean chemists all face similar chal- 
lenges in their working conditions: relative 
isolation; overexposure to their colleagues; 
pressure to accomplish a lot of work in a 
short time with limited technical resources; 
and few options for escaping to family or 
friends outside work. If managed poorly, 
such circumstances can degrade a team’s 
cohesion, health and productivity. Scientists’ 
experiences, and the work of psychologists 
such as Harrison, suggest that researchers 
designing remote expeditions need to culti- 
vate a sense of mission to guard against tough 
conditions, plan ahead with suitable sched- 
ules and equipment, practise working in field 
conditions and, above all, exercise patience 
when colleagues, instruments and experi- 
ences don't live up to their initial hopes. 


FORWARD PLANNING 

In the early years of human space flight, 
mission planners had huge expectations 
for how the astronauts would spend their 
precious days in space. “Historically there 
was a tendency to over-programme,” says
Harrison. That was exacerbated by the enor- 
mous cost and limited flight time of each 
expedition.

On a smaller scale, the same thing often
happens in fieldwork on Earth. When Tim 
Wright, a geophysicist at the University of 
Leeds, UK, led his first field trip to the Afar 
region of Ethiopia, he had a long list of multi- 
disciplinary observations that he hoped to 
make, from collecting rocks to mapping fault 
structures. “We go in with these incredibly 
detailed plans: day one, go here; day two, go 
here,” says Wright. By the second day, “it’s usually
ripped up and used to start the fire, because
things take way longer than you anticipate”. 

Trying too hard to achieve too much can 
harm the scientists and their data. Alberto 
Escribano, second in command on the 
Hespérides, is in the Spanish navy, and has first- 
hand experience of the consequences of teams 
pushing themselves too far. He explains that 
Spanish navy crews typically alternate 6-hour 
work and rest shifts on combat ships, but if 
they maintain this schedule for much longer 
than two weeks, the crew members become 
exhausted — and havoc can ensue. Sailors who 
are overtired don’t cope well with tasks that 
demand close attention, such as keeping the 
ship steady while instruments hang overboard. 
The same effect can wear down researchers, 
leading to squashed fingers, broken sampling 
bottles and fewer or lower-quality data. The 
solution for longer voyages, such as seven- 
month cruises on the Hespérides, is for each 
crew member to take shorter watches with 
longer recovery times. 

“You want to avoid cumulative fatigue,” says
Harrison. “Rest periods are not just perks — 
they help workers maintain their strength and 
high level of productivity.” 

Conflicting research goals can overload 
an expedition’s agenda and put researchers 
under stress. The Hespérides expedition led 
by Dachs is just one leg of a Spanish National 
Research Council (CSIC) circumnavigation of 
the Indian Ocean that is trying to balance the 
research objectives of six disciplines, includ- 
ing biogeochemistry, microbiology and 
environmental chemistry. The ship has to be
stationary for researchers to use some instru- 
ments, such as the sampling rosette, but has 
to be moving for others, such as a pollution- 
sensing torpedo. Still other observations can be 
done only one at a time, to prevent cables from 
getting tangled. The resulting Byzantine sched- 
ule has, at times, left some researchers on the 
Hespérides working around the clock, sleeping 
only in short stretches, and others keeping a 
more comfortable 8 a.m.–3 p.m. schedule.


PRACTICE MAKES PERFECT 

The success of an expedition can hinge on 
compatibility — or lack thereof — between 
team members, so it is important to choose 
companions carefully, if possible. Wright says 
that on his field trips in Afar, the most laid- 
back people cope best with the conditions, and 
he takes this into account when selecting grad- 
uate students and hiring new members of the 
team. He counsels postdocs and investigators 
to discuss such considerations before deciding 
which graduate students or colleagues to invite 
on their own expeditions. 

Researchers who aren't at liberty to select 
their field companions should consider arrang- 
ing a shorter, simpler ‘test’ trip ahead of time, 
so people can learn to work with each other 
before the big commitment. Gloria Leon, a 
psychologist at the University of Minnesota 
in Minneapolis, says that space agencies often 
test teams of astronauts for compatibility by 
taking them on short field excursions or even 
road trips, exposing them to the tight spaces, 
tough field conditions and even body odour of 
a real adventure. Escribano endorses this kind
of advance testing. “If I had to screen people 
for this, I'd lock them in a room and shake it 
around for a while,” he says. He recommends
watching for stress-induced irritability or self- 
imposed isolation, and offering team members 
extra attention to help them deal with the stress, 
if they need it (see ‘Tips for far-off fieldwork’).

BEFORE YOU GO
Tips for far-off fieldwork

A few precautions will help to promote
an efficient, productive and comfortable
research expedition.

• Cultivate a higher sense of purpose:
you’re learning how plankton trap
carbon, not just filling dozens of bottles
with sea water.

• Make your quarters habitable. People
on different shifts shouldn’t have to
share the same bunk.

• Test your equipment. The field is not a
good place to learn how it works.

• Test your team. Get to know how they
will respond to field conditions, and head
off any problems that arise.

• Remember to rest your body. In the
long run, your work will be better.

• Build downtime into the schedule; if
delays don’t use it up, new opportunities
will. For example, after the meltdown
at the Fukushima Daiichi nuclear plant
in Japan in March, the scientists on the
Spanish research vessel B/O Hespérides
began to take previously unplanned
measurements of radiation levels.

• Promote open communication to let
team members brainstorm or just blow
off steam. L.L.


Even if a research group doesn’t have the
luxury of running a trial trip, members can still
benefit from working together in advance and 
trying to sort out any interpersonal difficul- 
ties, says Leon. The sole female member of one 
group of polar explorers that Leon studied com- 
plained that the men in the team competed with 
each other and all vented to her, giving her
a disproportionate and unexpected emotional 
burden. Had she been able to anticipate that 
dynamic, the explorer might have had a chance 
to address her companions’ behaviour before- 
hand and avoid undue strain during the trip. 


KEEP YOUR PATIENCE 
Once a researcher is out in the field, he or she 
should recognize that not everything is con- 
trollable. For example, on one occasion, the 
Hespérides experienced four straight days of 
rough seas south of Madagascar, and Dachs’s 
team couldn't lower the sampling rosette at all. 
Such things happen on all expeditions. “Sometimes
you just can’t collect data,” says Wright.
“There’s no point in getting stressed. That isn’t
going to make anything happen any faster.”

Focusing on a long-term career or scientific
goal can help researchers to cope with the
stress of fieldwork, says Leon. She has learnt 
that the people who succeed on expeditions to 
the poles — whether they are scientists, sports 
enthusiasts or explorers — are able to tolerate 
the conditions because they are trying to
achieve something important to them, such as
being the first people to ski to the North Pole.
Similarly, scientists should keep in mind the
scientific problems they will be able to solve,
or the papers they will publish, when they’re
staying up all night to monitor instruments or
bathing from a bucket in the desert.

Crew members have to cooperate to overcome difficulties on long voyages.

Shrugging off the frustration is easier for 
old hands than for expedition novices such 
as Patricia Puerta, a marine-biology graduate 
student at the CSIC’s Mediterranean Institute 
for Advanced Studies in Majorca, Spain, who 
is staking her PhD on plankton data collected 
on the Hespérides. On Puerta’s first cruise, 
loose equipment on deck broke her incubation 
tanks during storms. “My stuff was cannon 
fodder,” she recalls, and she worried that she
wasn’t making enough progress on her PhD.
By the second leg, Puerta had found better 
ways of rigging her incubation tanks to save 
them from damage. She also learned the value 
of patience, she says. “Delays still bother me, 
but now I’m more accepting.” Puerta realized 
that not every sample was crucial. 

Part of becoming an accomplished field- 
worker is learning to delegate responsibilities 
such as planning the logistics of an expedi- 
tion, and to rely on experts to handle things 
that the researchers can't do as effectively 
themselves, says Wright. In the Afar, that has 
meant trusting local scientists or technical 
staff to help install instruments, negotiate 
maintenance with Afaris and rent camels for
transport. “We couldn’t do any of this without
our local colleagues,” says Wright. “You can't 
have your eye on all of the balls that are in the 
air.” For Puerta, that meant accepting prac- 
tical assistance: when her fraction collector 
broke, preventing her from taking automatic 
water samples overnight, she didn't, in this 
instance, have the technical skills to fix it, 
and had to trust that someone else on board 
did. “It can make people feel helpless,” says 
Escribano, who, it turned out, travels with
a silver briefcase of electronics equipment
and was able to diagnose and repair Puerta’s 
instrument. 

Seasoned field scientists must also learn 
to deal with extended stretches of discom- 
fort and boredom. Breaking the routine on 
the ship can help to distract people from the 
discomforts, says Escribano. He has used cos- 
tume contests and card games to entertain 
the research team. Others have different ways 
of dealing with the relentless presence of oth- 
ers, says Leon; she recalls a polar explorer 
who would “dig real deep in her sleeping bag 
at night to cover her head” to secure some 
respite from her companions. 

Technology can also mitigate the monotony 
of fieldwork. Satellites allow team members 
to ease homesickness by contacting family 
and friends by phone and through e-mail 
and social-networking sites, even if only 
for limited amounts of time. Communica- 
tions technology can also facilitate scientific 
decision-making, says Dachs. “On a cruise, 
you have to react fast, and we use e-mail to 
consult colleagues on land when we have 
a problem or a question,” he says. From the
Indian Ocean, he was able to talk to the other 
senior expedition planners at research insti- 
tutes throughout Spain and work out which 
observations to trim. Not everybody was 
happy, but everybody was in the loop, and in 
the end they didn’t have to cut quite as many
observations as Dachs had feared. 

Dachs has one more trick up his sleeve to 
boost morale: creative time-keeping. The day 
before arriving in Perth, Australia, for a short 
break in the cruise, the Hespérides approached
its offshore meeting point — where a local 
harbour pilot helps guide the ship into port 
— earlier than announced. “When you make 
the cruise schedule,” says Dachs, “you always
underestimate the ship’s speed.”


Lucas Laursen is a freelance journalist 
based in Zurich, Switzerland. 
UNITED KINGDOM


Postdocs with promise 


The University of Birmingham, UK, has 
launched a global search for 50 postdocs, 
mainly in science, engineering and 
maths. The university is offering five- 
year fellowships leading to permanent 
academic posts, providing the stability 
sought by many young researchers. It is 
initially investing between £3.5 million 
(US$5.7 million) and £5 million in the 
scheme. Applicants must have achieved a 
PhD in the past seven years; postdocs will 
earn about £37,000 annually. “Our vision 
is to reinvigorate research while creating a 
cohort of outstanding people that identify 
with this university,” says Adam Tickell,
pro-vice-chancellor for research. Tickell 
adds that if the programme receives many 
qualified applicants, funding will be 
available to hire more than the initial 50. 
The first round of applications closes
on 1 September.


GERMANY 
Research centres open 


Twenty-one collaborative research centres 
in Germany will create up to 630 jobs
for postdocs and PhD candidates. The
German Research Foundation (DFG),
the country’s main granting agency, is
spending €197 million (US$289 million) 
on the centres, which will focus mainly on 
biological and physical sciences. Research 
programmes will range from sensory 
processing to star formation. Opening
on 1 July at various host universities, the
institutions bring the total number of
DFG-funded research centres to 250. The
agency will fund the centres for up to
12 years, with about 25 contract scientists
each, mostly PhD candidates and the rest 
postdocs. There are no restrictions on the 
nationalities of applicants. 


US POSTDOCS 


Women feel isolated 


Focus groups convened by the US
National Postdoctoral Association (NPA)
in Washington DC say that women’s 
academic careers in the United States are 
hindered by factors including isolation, 
lack of confidence, perceived lack of status 
and unfriendly family policies. The groups 
are part of a National Science Foundation-funded
project to identify best practices for
advancing female postdocs’ careers. Cathee 
Johnson Phillips, NPA executive director, 
says that many women in all disciplines 
reported inequality and isolation. The NPA 
will publish its findings in 2012. 
NMDA receptor blockade at rest triggers rapid
behavioural antidepressant responses 


Anita E. Autry, Megumi Adachi, Elena Nosyreva, Elisa S. Na, Maarten F. Los, Peng-fei Cheng, Ege T. Kavalali
& Lisa M. Monteggia


Clinical studies consistently demonstrate that a single sub-psychomimetic
dose of ketamine, an ionotropic glutamatergic NMDAR
(N-methyl-D-aspartate receptor) antagonist, produces fast-acting
antidepressant responses in patients suffering from major depressive
disorder, although the underlying mechanism is unclear.
Depressed patients report the alleviation of major depressive disorder
symptoms within two hours of a single, low-dose intravenous
infusion of ketamine, with effects lasting up to two weeks, unlike
traditional antidepressants (serotonin re-uptake inhibitors), which
take weeks to reach efficacy. This delay is a major drawback to
current therapies for major depressive disorder and faster-acting
antidepressants are needed, particularly for suicide-risk patients.
The ability of ketamine to produce rapidly acting, long-lasting anti- 
depressant responses in depressed patients provides a unique 
opportunity to investigate underlying cellular mechanisms. Here 
we show that ketamine and other NMDAR antagonists produce 
fast-acting behavioural antidepressant-like effects in mouse models, 
and that these effects depend on the rapid synthesis of brain-derived 
neurotrophic factor. We find that the ketamine-mediated blockade 
of NMDAR at rest deactivates eukaryotic elongation factor 2 (eEF2)
kinase (also called CaMKIII), resulting in reduced eEF2 phosphor- 
ylation and de-suppression of translation of brain-derived neuro- 
trophic factor. Furthermore, we find that inhibitors of eEF2 kinase 
induce fast-acting behavioural antidepressant-like effects. Our find- 
ings indicate that the regulation of protein synthesis by spontan- 
eous neurotransmission may serve as a viable therapeutic target for 
the development of fast-acting antidepressants. 

We examined the acute effect of ketamine in wild-type C57BL/6 
mice and detected notable behavioural responses in antidepressant- 
predictive tasks, including the forced swim test (FST), novelty- 
suppressed feeding (NSF) and learned helplessness (Supplementary 
Figs 1a-e and 2a-c). Ketamine also produced such responses in a
sucrose consumption test, as well as in NSF and FST, after chronic 
mild stress, an animal model of depression (Supplementary Fig. 1f-i). 
To elucidate the mechanisms underlying the fast-acting antidepressant 
action of ketamine, we focused on FST, a test that is predictive of non-monoaminergic
antidepressant efficacy. We examined the time
course of behavioural antidepressant effects in wild-type mice after a
single, low-dose treatment with ketamine, (5S,10R)-(+)-5-methyl-10,11-dihydro-5H-dibenzo(a,d)cyclohepten-5,10-imine
maleate (MK-801) or 3-((R)-2-carboxypiperazin-4-yl)-prop-2-enyl-1-phosphonic
acid (CPP) (Fig. 1a-c). After either 30 min or 3 h, each NMDAR
antagonist markedly reduced the immobility of mice in FST, when 
compared to vehicle-treated animals, indicating that NMDAR 
blockade produces fast-acting antidepressant responses. Notably, in 
our system, acute treatment with conventional antidepressants did 
not produce antidepressant-like FST responses (Supplementary Fig. 
3), which may require multiple doses. The effects of ketamine and
CPP, but not of MK-801, persisted for 24 h (ref. 4) and ketamine’s
behavioural effect lasted for 1 week. Acute NMDAR-antagonist 

Figure 1 | Time course of NMDAR antagonist-mediated antidepressant-like
behavioural effects. Mean immobility ± s.e.m. of C57BL/6 mice in FST
after acute treatment with ketamine, CPP or MK-801. Independent groups of
mice were used at each time point and for each drug treatment, to avoid
behavioural habituation. Analysis of variance (ANOVA) F(3,27) = 30.31,
P < 0.0001 for treatment groups; F(3,27) = 19.06, P < 0.0001 for duration of
response; F(9,81) = 9.32, P < 0.0001 for treatment-duration interaction.
Therefore, we examined treatment effects by time point. a, Ketamine
(3.0 mg kg⁻¹) significantly reduced immobility, indicating an antidepressant-like
response, at 30 min, 3 h, 24 h and 1 week, compared to vehicle treatment.
b, CPP (0.5 mg kg⁻¹) significantly reduced immobility at 30 min, 3 h and 24 h,
compared to vehicle treatment. c, MK-801 (0.1 mg kg⁻¹) produced significant
decreases in immobility at 30 min and 3 h compared to vehicle treatment.
n = 10 mice per group per time point; *, P < 0.05. Here and in all figures, error
bars represent s.e.m.

treatment produced no alterations in hippocampal-dependent learning
(Supplementary Fig. 1d) or locomotor activity (Supplementary Fig.
4). These drugs have short half-lives (about 2-3 h), indicating that
sustained NMDAR-antagonist-induced antidepressant responses are 
due to synaptic plasticity, not to persistent blockade of receptors. 
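
The half-life argument can be made concrete with a short calculation. The sketch below assumes simple first-order (single-compartment) elimination, which is an illustrative simplification and not an analysis from the paper; the function name `fraction_remaining` is hypothetical. Under that assumption, a drug with a 3 h half-life has gone through eight half-lives by 24 h, leaving 1/256 (under 0.4%) of the initial amount, and far less after a week.

```python
def fraction_remaining(hours_elapsed, half_life_hours):
    """Fraction of the initial dose left, assuming first-order elimination."""
    return 0.5 ** (hours_elapsed / half_life_hours)


if __name__ == "__main__":
    # Half-lives of about 2-3 h are quoted in the text; time points follow Fig. 1.
    for half_life in (2.0, 3.0):
        for t in (0.5, 3.0, 24.0, 168.0):
            frac = fraction_remaining(t, half_life)
            print(f"half-life {half_life} h, t = {t} h: {frac:.4%} remaining")
```

The behavioural effects at 24 h and 1 week therefore outlast any plausible receptor occupancy by the drug itself, which is the basis for attributing them to synaptic plasticity.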

Brain-derived neurotrophic factor (BDNF) is linked to traditional 
antidepressant action; BDNF expression in the hippocampus is 
increased by antidepressants and BDNF deletion in the hippocampus
attenuates antidepressant behavioural responses. Moreover, intraventricular
or intrahippocampal BDNF infusion causes rapid, sustained
antidepressant-like effects, lasting 3-6 days in FST. To examine
whether the antidepressant-like response to ketamine is mediated
through BDNF, we administered ketamine to inducible Bdnf-knockout
mice, then observed FST behaviour. After 30 min, ketamine-treated
wild-type littermate controls showed significant reductions in immobility, 
indicating antidepressant-like responses, when compared to vehicle- 
treated controls (Fig. 2a). However, ketamine did not produce anti- 
depressant-like effects in Bdnf-knockouts, indicating that fast-acting 
antidepressant responses require BDNF. After 24h, ketamine signifi- 
cantly reduced immobility in controls, but not in Bdnf knockouts 
(Fig. 2a), indicating that ketamine’s sustained effects depend on 
BDNF. To validate this link between NMDAR antagonists and 
BDNF-mediated antidepressant responses, MK-801 was administered 
to Bdnf knockouts or controls. After 30 min, MK-801 significantly 
reduced FST immobility in controls, but had no effect in Bdnf knock- 
outs (Supplementary Fig. 6). MK-801 did not affect FST behaviour 
after 24h (Supplementary Fig. 6), as previously demonstrated 
(Fig. 1c). We next generated postnatal conditional knockouts in neurotrophic
tyrosine kinase receptor, type 2 (Ntrk2, also called TrkB) and
found that these mice were insensitive to ketamine’s antidepressant- 
like effects in FST and NSF (Supplementary Fig. 5a, b). To confirm 
TrkB engagement, we examined receptor autophosphorylation and 
found increased TrkB activation after NMDAR antagonist treatment 
(Supplementary Fig. 5c). 

To determine whether NMDAR antagonists alter Bdnf expression 
in the hippocampus, wild-type mice were treated acutely with vehicle, 
ketamine or MK-801. Quantitative RT-PCR analysis of the coding 
exon of Bdnf showed that Bdnf mRNA expression was unaltered by 
ketamine or MK-801 at either 30 min or 24h after treatment (Sup- 
plementary Fig. 7a). Contrastingly, western blot and ELISA analyses 
showed that BDNF protein levels were markedly increased at 30 min, 
but not at 24h, after NMDAR antagonist treatment (Fig. 2b and 
Supplementary Fig. 7b). Moreover, the acute effects of ketamine on 
BDNF extended to its precursor, proBDNF (Supplementary Fig. 7c). 
These data indicate that rapid increases in BDNF protein translation, 
not transcription, are necessary for fast-onset antidepressant res- 
ponses. However, continued BDNF protein upregulation does not 
underlie ketamine’s long-term behavioural effects. 

To study further the roles of translation and transcription in keta- 
mine’s antidepressant-like effects, we examined FST behaviour in mice 
treated with the protein synthesis inhibitor anisomycin or with the
RNA polymerase inhibitor actinomycin D, which block their respective
processes by about 80% within 2 h. We pretreated mice with anisomycin 
or actinomycin D before treating them with ketamine (Fig. 2c). 
Anisomycin prevented the ketamine-induced rapid behavioural res- 
ponses seen at 30 min in FST and NSF paradigms, indicating a depend- 
ence on new protein synthesis (Fig. 2d and Supplementary Fig. 8a, b). 
Anisomycin also prevented ketamine’s long-term effect on FST (24h), 
indicating that rapid protein translation was involved in sustained 
antidepressant-like responses (Fig. 2e). We found that the synthesis of 
both mature BDNF and proBDNF in the hippocampus was sensitive to 
anisomycin treatment (Supplementary Fig. 8c, d). However, actinomy- 
cin D did not affect ketamine’s antidepressant-like effect on FST at 
either time point, indicating that it is independent of new gene expres- 
sion (Supplementary Fig. 9b, c). To confirm that actinomycin D 
crossed the blood-brain barrier, we examined Bdnf mRNA expression 
Figure 2 | BDNF translation in the antidepressant effects of NMDAR
antagonists. a, Immobility in FST after acute treatment with ketamine
(3.0 mg kg⁻¹). At 30 min, ANOVA F(1,35) = 17.13, P = 0.0002 for drug;
F(1,35) = 7.57, P = 0.0093 for genotype-drug interaction; multiple comparisons
with t-test, *, P < 0.05. At 24 h, in a separate cohort, ANOVA F(1,29) = 3.77,
P = 0.0619 for treatment; multiple comparisons with t-test, *, P < 0.05.
n = 7-12 mice per group. b, Densitometric analysis of BDNF (normalized to
GAPDH) in the hippocampus after treatment with vehicle (control), ketamine
(3.0 mg kg⁻¹) or MK-801 (0.1 mg kg⁻¹). At 30 min, ANOVA F(2,12) = 6.77,
P = 0.0108 for treatment, Bonferroni post hoc test, *, P < 0.05. At 24 h, no
significant differences were seen (n = 5-6 per group). c, Protocol for
experiments using the blockers anisomycin and actinomycin D. d, Immobility
at 30 min after anisomycin treatment. ANOVA F(1,34) = 11.83, P = 0.0016 for
treatment and F(1,34) = 10.91, P = 0.0023 for treatment-inhibitor interaction;
multiple comparisons, *, P < 0.05 (n = 8-10 per group). e, Immobility at 24 h
after anisomycin treatment. ANOVA F(1,31) = 9.34, P = 0.0046 for treatment;
multiple comparisons, *, P < 0.05 (n = 8-10 per group). f, Immobility of wild-type
mice given vehicle or NMDA (75 mg kg⁻¹), tested 30 min later in FST.
g, Immobility of mice given NBQX (10 mg kg⁻¹) or picrotoxin (1 mg kg⁻¹),
tested 30 min later in FST. h, Immobility of mice given vehicle, ketamine
(3.0 mg kg⁻¹) or ketamine + NBQX (10 mg kg⁻¹) and tested 30 min later in
FST. ANOVA F(2,26) = 8.226, P < 0.0019; Bonferroni post hoc analysis shows
that the ketamine effect is reversed by NBQX, *, P < 0.05. i, Densitometric
analysis of phosphorylated mTOR (normalized to mTOR) in the hippocampus
30 min after treatment with vehicle or ketamine.
in drug-treated animals and found decreased Bdnf transcription in the
hippocampus (Supplementary Fig. 9a). Taken together, these findings 
indicate that rapid, transient translation of BDNF is required for keta- 
mine’s fast-acting and long-lasting antidepressant-like behavioural 
effects and that long-term antidepressant responses may be due to 
alterations in synaptic plasticity, initiated by transient increases in 
BDNF translation. 

We observed increased levels of BDNF protein in the cortex, but not 
in the nucleus accumbens, 30 min after acute administration of keta- 
mine or MK-801 (Supplementary Fig. 10a, b). We further investigated 
whether NMDAR antagonism affected proteins other than BDNF. We 
found an increased level of activity-regulated cytoskeletal-associated 
protein (ARC) in the hippocampus (sensitive to anisomycin treatment; 
Supplementary Fig. 8e) but there was no increase in HOMER or 
GRIA1 (also known as GLUR1), nor in the phosphorylation of ribo- 
somal protein s6 kinase (Supplementary Fig. 10c-f). Additionally, 
these proteins remained unaltered in the cortex after acute treatment 
with NMDAR antagonists (Supplementary Fig. 11a-e). 

Synaptic plasticity and ensuing learning processes are often 
mediated by NMDAR-activation-driven protein translation, but 
antidepressant-like effects require protein translation induced by 
NMDAR blockade. To resolve this paradox, we turned to recent evidence 
that NMDAR blockade by MK-801 or 2-amino-5-phosphonopentanoic 
acid (AP5), in the absence of neuronal activity, augments protein synthesis
through eEF2 dephosphorylation (activation). eEF2 is a critical catalytic
factor for ribosomal translocation during protein synthesis. In this
model, resting NMDAR activity causes sustained activation of eEF2
kinase (eEF2K, or CaMKIII), which phosphorylates eEF2, effectively
halting translation, whereas acute NMDAR blockade at rest (in the 
absence of action potentials) attenuates eEF2 phosphorylation, allowing 
the translation of target transcripts. 

To evaluate this model, we tested whether excess synaptic glutamate, 
possibly elicited by NMDAR blockade, was responsible for the beha- 
vioural effects of ketamine. Acute NMDA administration did not alter 
FST behaviour (Fig. 2f), as previously demonstrated, but it increased
ARC expression (Supplementary Fig. 10i), indicating that excess 
glutamate does not elicit rapid behavioural antidepressant effects. To 
define the role of neuronal activity in antidepressant behavioural 
effects, we tested whether NBQX, a blocker of α-amino-3-hydroxy-5-methyl-4-isoxazole
propionic acid (AMPA) channels that reduces
neuronal activity, or picrotoxin, a blocker of GABA (γ-aminobutyric
acid) channels that increases activity, affected FST behaviour. Acute
systemic treatment with these drugs did not affect FST behaviour 
(Fig. 2g) or BDNF synthesis, though picrotoxin enhanced ARC expres- 
sion in the hippocampus (Supplementary Fig. 10g, h). However, 
when co-applied with ketamine, NBQX abolished behavioural 
antidepressant-like responses in FST (Fig. 2h), as previously described.
These data indicate that behavioural antidepressant effects are not 
elicited by alterations in evoked neurotransmission, but require 
ketamine-mediated augmentation of AMPA-receptor activation. 

Recent evidence indicates that cortical mTOR signalling underlies 
ketamine-mediated antidepressant responses. We investigated
whether the rapid behavioural antidepressant effects of ketamine 
required mTOR activation, and whether this signalling was downstream 
of BDNF. Regulation of phosphorylated mTOR was not detected after 
acute administration of ketamine in control or Bdnf-knockout hippo- 
campal tissue (Fig. 2i), nor in wild-type cortical tissue (Supplementary 
Fig. 11d). In earlier work, rapamycin prevented ketamine-mediated 
antidepressant responses; however, the link between rapamycin and 
antidepressant-like effects is equivocal. We tested whether pre-treatment
with rapamycin could block acute ketamine-mediated FST behaviour.
Thirty minutes after ketamine administration, wild-type mice
showed antidepressant responses unaffected by rapamycin treatment 
(Supplementary Fig. 11h). Rapamycin reduced the phosphorylation of 
ribosomal protein s6 kinase in the cortex and hippocampus (Sup- 
plementary Fig. 11f, g), indicating that the rapamycin had penetrated 
brain tissue. The earlier study examined molecular effects 2 h after drug
treatment, or behavioural effects 24 h after drug treatment; therefore
mTOR’s role in the antidepressant effect of ketamine may be one of 
maintenance rather than rapid induction. 
Figure 3 | Ketamine blocks NMDAR spontaneous activity, reduces the level
of eEF2 phosphorylation and strengthens synaptic responses.
a, Representative western blots showing eEF2 phosphorylation (p-eEF2) in
hippocampal primary cultures. Ket, ketamine; TTX, tetrodotoxin.
b, Densitometric analysis of p-eEF2 normalized to total eEF2 (left panel). Data
are expressed as mean percentage ± s.e.m. Tetrodotoxin alone does not alter
p-eEF2, whereas AP5 or ketamine, with or without tetrodotoxin, significantly
reduce the level of p-eEF2, as assessed by t-test (*, P < 0.05). Right panel:
application of 1 µM, 5 µM or 50 µM ketamine causes dose-dependent decreases
in p-eEF2, as assessed by t-test (*, P < 0.05). c, Representative traces of
NMDAR spontaneous activity after application of 1 µM, 5 µM or 50 µM
ketamine. d, Quantification of charge transfer (10 s) reveals significant effects,
as assessed by t-test, for all ketamine concentrations compared to controls
(n = 6-16; *, P < 0.05). e, Field-potential (FP) slopes are plotted as a function of
time. Representative field-potential traces (average 2 min) are shown during
baseline (1) and at 45 min (2). The asterisk refers to significantly different field-potential
values (*, P < 0.05). For statistical analysis, we used two-way repeated
ANOVA with Bonferroni post hoc analysis. The drug-time interaction was
significant (F(143,1430) = 6.723, P < 0.001).
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To determine whether ketamine inhibits NMDA-receptor- 
mediated spontaneous miniature excitatory postsynaptic currents 
(NMDAR-mEPSCs)*™ when at rest, and whether it regulates eEF2 
phosphorylation, we tested its impact on hippocampal neurons in 
vitro. After ketamine perfusion (1 uM, 5 uM or 50 UM), we recorded 
NMDAR-mEPSCs (Fig. 3c) and detected a significant decrease within 
minutes, similar to the effect of AP5 (ref. 23). Moreover, protein 
extracts from ketamine-treated neurons showed decreased eEF2 phos- 
phorylation compared to vehicle-treated cultures, indicating that keta- 
mine, in the absence of neuronal activity, dose-dependently leads to 
eEF2 de-phosphorylation, permitting protein synthesis (Fig. 3a, b). 
Additionally, we evaluated ketamine’s effects on hippocampal field 
potentials. Acute application of ketamine (20 µM, at rest) potentiated
the synaptic responses subsequently evoked in hippocampal slices 
(Fig. 3e), further showing that increased AMPA-mediated neurotrans- 
mission underlies ketamine’s antidepressant-like behavioural effects. 
This result is consistent with findings regarding BDNF-dependent and 
protein-synthesis-dependent synaptic plasticity.

To examine whether the fast-acting antidepressant response is
mediated via eEF2, we administered ketamine or MK-801 to wild-type
mice and analysed eEF2 phosphorylation. Within 30 min, ketamine
and MK-801 led to rapid decreases in the level of phosphorylated eEF2
in the hippocampus (Fig. 4a-c and Supplementary Figs 12 and 13),
detected by immunostaining and western blot analysis (Fig. 4d). 
However, cortical levels of phosphorylated eEF2 were unaltered after 
acute NMDAR-antagonist treatment (Supplementary Fig. 11f). To 
examine whether eEF2K inhibition alters BDNF protein expression 
in vivo, the eEF2K inhibitors rottlerin or 1-hexadecyl-2-methyl-3- 
(phenylmethyl)-1H-imidazolium iodide (NH125) were administered 
to wild-type mice and the mice were killed 30 min later. Rottlerin and 
NH125 produced significantly increased BDNF protein expression 
(Fig. 4e, g), with corresponding significant decreases in phosphory- 
lated eEF2 in the hippocampus (Fig. 4f, h). To assess directly whether 
eEF2K inhibition is sufficient to mediate fast-acting antidepressant- 
like responses, wild-type mice were treated with rottlerin or NH125 
and then examined in FST. Both rottlerin and NH125 produced sig- 
nificant decreases in FST immobility at 30 min (Fig. 4i), a timescale 
similar to that of the effects of NMDAR antagonists, indicating that 
fast-acting behavioural effects are mediated through eEF2K inhibition. 
To test whether mitogen-activated protein kinase (MAPK), a regulator 
of protein translation during neural activity, affects FST behaviour, we 
treated wild-type mice with the inhibitor SL327. This treatment 
reduced MAPK phosphorylation in hippocampal tissue (Supplemen- 
tary Fig. 10), but did not affect FST behaviour (Fig. 4i), indicating that 
antidepressant-like effects are specific to eEF2K inhibition during rest- 
ing spontaneous glutamatergic signalling. We found that an acute dose 
of rottlerin or NH125 did not affect locomotor activity, but that 
antidepressant-related behavioural effects were long-lasting (Sup- 
plementary Fig. 14a-f). To validate the finding that antidepressant 
effects after eEF2K inhibition were mediated through BDNF, we 
administered rottlerin to Bdnf-knockout mice and tested FST beha- 
viour. Like NMDAR antagonists, rottlerin was ineffective in Bdnf 
knockouts, showing that increased Bdnf expression upon eEF2K 
inhibition is required to produce antidepressant-like behavioural res- 
ponses (Fig. 4j). 

Our data support the hypothesis that ketamine produces rapidly
acting antidepressant-like behavioural effects through inhibition of
spontaneous NMDAR-mEPSCs, leading to decreased eEF2K activity,
thus permitting rapid increases in BDNF translation (Supplementary
Fig. 15) which may, in turn, exert strong influences on presynaptic or
postsynaptic efficacy. We found that fast-acting antidepressant-like
effects cannot be elicited by disinhibition of behavioural circuitry,
or by evoked neurotransmission, but must rely on enhanced
neurotransmission after NMDAR-antagonist-induced plasticity, occurring
at rest. The observation of behavioural effects mediated through
spontaneous neurotransmission provides the first evidence that tonic
resting neurotransmission is involved in behaviour, and supports the
notion that spontaneous and evoked forms of glutamatergic signalling
are segregated. These data demonstrate that eEF2K inhibition,
resulting in de-suppression of protein translation, is sufficient to
produce antidepressant-like effects, implicating eEF2K inhibitors as
potential novel major depressive disorder treatments with rapid onset.
Moreover, our results show that synaptic translational machinery may
serve as a viable therapeutic target for the development of faster-acting
antidepressants.

Figure 4 | Rapid antidepressant-like behaviour is mediated by decreased
p-eEF2 and increased BDNF translation. a, Images of CA1 pyramidal and
stratum radiatum layers after acute treatment with vehicle, ketamine or
MK-801. Scale bar, 100 µm; red, p-eEF2; blue, DAPI. b, Magnification of stratum
radiatum; scale bar, 20 µm. c, ImageJ analysis of average fluorescence intensity
(a.u., arbitrary units). ANOVA on cell layer, F(2,23) = 13.13, P = 0.0002 for
treatment; ANOVA on dendrites, F(2,23) = 14.06, P = 0.0001 for treatment
(n = 4 per group; *, P < 0.05). d, Densitometric analysis of p-eEF2 normalized
to total eEF2 in the hippocampus after treatment with NMDAR antagonists.
ANOVA F(2,23) = 3.183, P = 0.03 for treatment (n = 8 per group).
e-h, Densitometric analyses of BDNF and p-eEF2. Significant increases are
seen in hippocampal BDNF protein levels (normalized to GAPDH) with
rottlerin (5 mg kg⁻¹) versus vehicle (e), and with NH125 (5 mg kg⁻¹) versus
vehicle (g) (t-tests, *, P < 0.05). Significant decreases are seen in p-eEF2
(normalized to total eEF2) with rottlerin versus vehicle (f) and NH125 versus
vehicle (h) (t-tests, *, P < 0.05). i, Immobility in FST of wild-type mice given
acute rottlerin (5 mg kg⁻¹) or NH125 (5 mg kg⁻¹). ANOVA F(3,44) = 8.13,
P = 0.0002 for treatment; Bonferroni post hoc analysis shows significance with
rottlerin or NH125 versus vehicle (*, P < 0.05), but not with the MAPK
inhibitor SL327 (10 mg kg⁻¹). j, Immobility of Bdnf-knockout mice or
littermate controls given acute rottlerin (5 mg kg⁻¹) and tested 30 min later in
FST. ANOVA F(1,19) = 5.77, P = 0.0267 for treatment; Bonferroni post hoc
analysis for rottlerin versus vehicle-treated controls (*, P < 0.05; n = 5-7 per
group).


METHODS SUMMARY 


Behavioural studies were performed using adult male C57BL/6 wild-type or
mutant mouse strains, maintained as previously described. All drugs were
administered via intraperitoneal injection. Antidepressant-like behaviour was
assessed using the forced swim test, as previously described. Briefly, animals were
placed in a cylinder of water at 22-24 °C for 6 min and immobility was measured
during the last 4 min of the test. Molecular studies consisted of western blot
analysis or quantitative PCR performed on whole-cell lysates from medial prefrontal
cortex or anterior hippocampus. Electrophysiological studies were performed
as previously described in cultured neurons (whole-cell recordings) or
in hippocampal slices (field recordings).


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature.

METHODS

Mice. C57BL/6 male mice aged 6-8 weeks were habituated to animal facilities
for 1 week before behavioural testing. Mice were kept on a 12 h/12 h light/dark
cycle and were given access to food and water ad libitum. Inducible Bdnf knockouts
were generated from a trigenic cross of NSE-tTA, TetOp-Cre and floxed Bdnf
mice, as previously described. Conditional Ntrk2-knockout mice were made by
crossing CamK-cre(93) (ref. 15) to floxed Ntrk2 mice. For all behavioural testing, 
male mice were 2-4 months old and weight-matched, and groups were balanced 
by genotype. All animal procedures conformed to the guide for the care and use of 
laboratory animals and were approved by the institutional animal care and use 
committee at UT Southwestern Medical Center. 

Drugs. All drugs were injected intraperitoneally. Doses were as follows:
ketamine (Fort Dodge Animal Health) 3.0 mg kg⁻¹, MK-801 (Sigma) 0.1 mg kg⁻¹
and CPP (Sigma) 0.5 mg kg⁻¹ in 0.9% saline; anisomycin (Sigma) 100 mg kg⁻¹
(dissolved in HCl/saline, final pH 7.4); actinomycin D (Sigma) 0.5 mg kg⁻¹ in 5%
ethanol; rottlerin and NH125 (Sigma) 5 mg kg⁻¹ in 20-100% DMSO; SL327
(Sigma) 10 mg kg⁻¹ in 100% DMSO; NMDA (Sigma) 75 mg kg⁻¹, NBQX
(Sigma) 10 mg kg⁻¹ and picrotoxin (Sigma) 1 mg kg⁻¹ in 0.9% saline; rapamycin
(Sigma) 1.0 mg kg⁻¹ dissolved in 50% DMSO.

Sucrose consumption test. Group-housed mice were habituated to a 1% solution 
of sucrose in tap-water for 48 h. The mice were then habituated to water-deprivation 
periods of 4h, 14h and 19h, followed by a 1h exposure to the sucrose solution for 
3 days with intervening access to normal drinking water. To assess individual suc- 
rose intake, the group-housed mice were water-deprived overnight and then housed 
temporarily in a new cage. Each test mouse was placed in its home cage for 1 h with 
access to the 1% sucrose solution. The bottle of sucrose solution was weighed before 
and after the test to determine sucrose intake. A water test was performed in a similar
manner the following day. Data are expressed as a percentage of sucrose to total 
volume consumed in both sucrose and water trials. 
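
The preference score described above can be sketched as a short calculation. This is a minimal illustration, not the authors' analysis code: the function name and argument names are hypothetical, and it assumes that the mass lost from each bottle is used as a proxy for the volume consumed.

```python
def sucrose_preference(sucrose_before_g, sucrose_after_g,
                       water_before_g, water_after_g):
    """Return sucrose intake as a percentage of total (sucrose + water) intake.

    Assumption: bottle mass lost during the 1 h test stands in for volume consumed.
    """
    sucrose_intake = sucrose_before_g - sucrose_after_g
    water_intake = water_before_g - water_after_g
    if sucrose_intake < 0 or water_intake < 0:
        raise ValueError("bottle weight increased during the test")
    total = sucrose_intake + water_intake
    if total == 0:
        raise ValueError("no fluid consumed in either trial")
    return 100.0 * sucrose_intake / total


# Hypothetical example: 3 g of sucrose solution and 1 g of water consumed.
preference = sucrose_preference(250.0, 247.0, 250.0, 249.0)
```

With these made-up bottle weights the mouse drank three times as much sucrose solution as water, so the score is 75%.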

Elevated plus maze. Mice were placed in the centre of a plus maze (each arm 
33 cm × 5 cm) that was elevated 1 m above the floor with two open arms and two
closed arms (25-cm-tall walls on the closed arms) at 40 lx. The exploratory activity
was monitored for 5 min with a video tracking system and the duration, in seconds,
spent in the closed and open arms was recorded by EthoVision software.

Novelty-suppressed feeding. Briefly, group-housed animals were food-deprived
for 24 h and then placed in a temporary home cage for 30 min. For the test,
individual mice were placed in a 42 × 42 cm open-field arena at 40 lx. A single
pellet of the mouse’s normal food chow was placed in the centre of the open-field 
arena. Each animal was placed in a corner of the arena and allowed to explore for 
up to 10 min. The trial ended when the mouse chewed a part of the chow. The 
amount of food consumed in the home cage was taken as the weight of chow 
consumed in 5 min, as a control measure for appetite. 

Context and cued fear conditioning. Fear conditioning was performed as previously
described. Briefly, mice were placed in individual chambers for 2 min,
followed by a loud tone (90 dB) for 30 s, immediately followed by a 0.5 mA footshock
for 2 s. After 1 min, mice received a second pairing of tone and footshock, as
described. Mice were placed in home cages until 24 h later, when the mice were
placed back in the same boxes without a tone or shock. The amount of time that the
animal spent freezing was scored by an observer blind to genotype. Freezing
behaviour was defined as no movement except for respiration. Four hours later,
mice were placed in a novel environment with no tone or shock for 3 min, followed
by 3 min of the tone to assess cue-dependent fear conditioning. Again, time spent
freezing was recorded as described.

Learned helplessness. Mice were trained on one side of a two-chamber shuttlebox 
(MedAssociates) with the door closed for 1h, receiving 120 variable-interval 
shocks (18-44 s, average 30 s; 0.35 mA for 2 s) on 2 training days. On the test
day, the door was raised at the onset of the shock and the shock ended either
when the mouse stepped through to the other side of the shuttlebox or after 25 s.
Latency to step through the door and the number of escape failures were recorded 
for 15 trials. 

Locomotor activity. Mice were placed in cages and locomotor activity was 
recorded for 1 h under red light by photocell beams linked to computer acquisition 
software (San Diego Instruments). 

Forced swim test. The forced swim test (FST) was performed as previously
described. This test is sensitive to conventional antidepressant treatment as
well as to non-monoaminergic antidepressants. Mice were placed for 6 min in a 4 l
Pyrex glass beaker containing 3 l of water at 24 ± 1 °C. Water was changed
between subjects. All test sessions were recorded by a video camera positioned 
on the side of the beaker. The videotapes were analysed and scored by an observer 
blind to group assignment during the last 4 min of the 6 min trial. A decrease in 
immobility time indicates an antidepressant-like response. 
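
The scoring window can be illustrated with a small sketch. It assumes, purely for illustration, that the observer's scoring has been reduced to one immobile/mobile flag per second of the 6 min trial; the text does not state how the videotapes were coded, and the function name is hypothetical.

```python
def fst_immobility_seconds(immobile_by_second, trial_s=360, scored_s=240):
    """Count immobile seconds in the last `scored_s` seconds of the trial.

    Assumption: `immobile_by_second` holds one boolean per second of the trial.
    """
    if len(immobile_by_second) != trial_s:
        raise ValueError("expected one flag per second of the trial")
    scored_window = immobile_by_second[trial_s - scored_s:]
    return sum(1 for flag in scored_window if flag)


# Hypothetical trial: the first 2 min are ignored, then the mouse is
# immobile every other second for the last 4 min.
flags = [True] * 120 + [True, False] * 120
immobility = fst_immobility_seconds(flags)
```

Only the final 240 s contribute, so the immobility in the first 2 min does not affect the score.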

Chronic mild stress. Stressed mice were subjected to two randomly selected mild
stressors per day, of variable duration (1-12 h), for 28 days. Stressors included
water deprivation, 45° cage-tilt, food deprivation, exposure to rat faeces, cage
overcrowding, wet bedding, overnight illumination, dark exposure during normal
light cycle, cold bedding, acoustic disturbance (120 dB), strobe lights and
cage-mate rotation. Stressors were not applied within 8 h of behavioural testing.

Time course experiments. Separate cohorts of C57BL/6 adult male mice were
injected intraperitoneally with vehicle or the NMDAR antagonists ketamine
(3.0 mg kg⁻¹), MK-801 (0.1 mg kg⁻¹) or CPP (0.5 mg kg⁻¹) at 30 min, 3 h, 24 h
or 1 week before FST (n = 10 per group). The drug doses were chosen on the basis
of previous literature demonstrating an antidepressant-like response in mouse
models.

Anisomycin and actinomycin D experiments. Separate cohorts of C57BL/6
adult male mice were injected intraperitoneally with either vehicle or anisomycin
(100 mg kg⁻¹), or with either saline or actinomycin D (0.5 mg kg⁻¹), 1 h before
FST. Thirty minutes before testing, mice received either a saline or a ketamine
injection (3.0 mg kg⁻¹) (n = 10 per group). For 24 h experiments, mice were given
anisomycin (100 mg kg⁻¹) or saline 30 min before an injection of ketamine and
were tested in the FST 1 day later.

Inducible Bdnf-knockout experiments. Separate cohorts of inducible Bdnf
knockout adult male mice and wild-type littermate controls were subjected to
FST either 30 min or 24 h after injection with saline, ketamine (3.0 mg kg⁻¹) or
MK-801 (0.1 mg kg⁻¹) (n = 7-12 per group).

Quantitative RT-PCR. Fresh frozen anterior hippocampal slices (2 per mouse,
~1 mm thick) were dissected and total RNA was extracted using Trizol reagent
(Invitrogen), according to manufacturer’s instructions. Conditions for cDNA
synthesis, amplification and primer sequences were described previously. The
fold-change in Bdnf expression (coding exon) was normalized to GAPDH.

Protein quantification. Anterior hippocampal slices (2 per mouse, ~1 mm thick)
were dissected from C57BL/6 mice that had received saline vehicle, ketamine
(3.0 mg kg⁻¹) or MK-801 (0.1 mg kg⁻¹) injections, either 30 min or 24 h after
injection. The slices were rapidly frozen and lysed in buffer containing protease
inhibitors and phosphatase inhibitors. Total protein concentration was quantified
by Bradford analysis. BDNF quantification was carried out by SDS-polyacrylamide
gel electrophoresis. Primary antibodies for BDNF (Santa Cruz Biotechnology) and
GAPDH (Cell Signaling) were used at dilutions of 1:200 and 1:10,000, and
anti-rabbit secondary antibodies were used at 1:2,000 and 1:50,000, respectively. To
measure phosphorylated eEF2 (p-eEF2, Thr 56) and total eEF2, primary antibodies 
were used at dilutions of 1:1,000 and anti-rabbit secondary antibodies were used at 
1:2,000. Mouse anti-ARC (C7, Cell Signaling) was used at a primary dilution of 
1:1,000 and secondary dilution of 1:2,000. Phospho-mTOR and total mTOR 
(Cell Signaling) were both used at primary dilutions of 1:500 and secondary dilu- 
tions of 1:10,000. GluR1 (Chemicon) was used at a primary dilution of 1:5,000 and 
secondary dilution of 1:2,000. Pan-HOMER antibody (Cell Signaling) was used at
1:5,000 and 1:2,000 dilutions for primary and secondary, respectively. Phospho-s6
kinase and total s6 kinase antibodies were used at 1:200 and 1:5,000 for primary 
dilutions, respectively, and both had secondary dilutions of 1:5,000 (Cell Signaling). 
Phospho-MAPK and total MAPK antibodies (Cell Signaling) were used at primary 
dilutions of 1:10,000 and 1:500 respectively and both had secondary dilutions of 
1:2,000. Bands developed with enzymatic chemiluminescence (ECL) were exposed 
to film and films were analysed by ImageJ. BDNF was normalized to GAPDH 
bands, and p-eEF2 and total eEF2 bands were taken as a ratio of GAPDH-normalized 
values. 
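
One way to read the normalization described above is sketched below. The function names are hypothetical, and the sketch assumes that each band (p-eEF2 and total eEF2) is first divided by the GAPDH band from its own blot before the ratio is taken; the text does not spell out whether one or two loading-control bands were used. Expressing the result as a percentage of the vehicle mean is likewise an assumption, based on the "mean percentage" wording of the figure legends.

```python
def p_eef2_ratio(p_eef2, total_eef2, gapdh_for_p, gapdh_for_total):
    """Ratio of GAPDH-normalized p-eEF2 to GAPDH-normalized total eEF2."""
    return (p_eef2 / gapdh_for_p) / (total_eef2 / gapdh_for_total)


def percent_of_control(values, control_values):
    """Express each value as a percentage of the mean of the control group."""
    control_mean = sum(control_values) / len(control_values)
    return [100.0 * v / control_mean for v in values]


# Hypothetical densitometry readings (arbitrary units).
vehicle = [p_eef2_ratio(200.0, 400.0, 100.0, 100.0),
           p_eef2_ratio(300.0, 400.0, 100.0, 100.0)]
treated = [p_eef2_ratio(125.0, 400.0, 100.0, 100.0)]
treated_percent = percent_of_control(treated, vehicle)
```

Here the vehicle ratios are 0.5 and 0.75 (mean 0.625), so a treated ratio of 0.3125 corresponds to 50% of control.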

Immunohistochemistry. C57BL/6 mice were treated intraperitoneally with saline, 
ketamine (3.0 mg kg⁻¹) or MK-801 (0.1 mg kg⁻¹) and killed 30 min later. The
protocol is adapted from a previous study. Brains were fresh-dissected and fixed
for 72 h in ice-cold 4% paraformaldehyde. Brains were cryoprotected for 2 or more
hours in 20% glycerol, sectioned on a freezing microtome at 30 µm and preserved in
1× PBS with 0.01% sodium azide. Floating sections were washed in 2× SSC,
followed by antigen-unmasking in 50:50 acetone:methanol, performed at 4 °C.
Sections were rinsed and endogenous peroxidase activity was quenched in
1% H₂O₂ for 30 min. Sections were rinsed in 2× SSC with 0.05% Tween-20.
Tissue was blocked for 30 min in 3% normal goat serum diluted in 2× SSC/
0.05% Tween, followed by primary antibody, rabbit anti-p-eEF2 (diluted 1:100 in
blocking solution; Cell Signaling Technology), and incubation for 48 h at 4 °C. After 
rinsing in 2X SSC, a horseradish-peroxidase-labelled secondary antibody at 1:200 
was applied and the signal was amplified using the tyramide amplification signal 
system (Perkin Elmer). Slides were counterstained with 4',6-diamidino-2- 
phenylindole (DAPI), mounted on superfrost plus slides, dried for 2h and mounted 
in DPX mountant. 

ELISA. A high-sensitivity enzyme-linked immunosorbent assay was used to assess 
BDNF levels, as per manufacturer’s instructions (Promega). Briefly, hippocampal 
lysates were prepared in the recommended buffer, diluted 1:4 in 1X PBS and acid- 
treated as instructed by the manufacturer. A 96-well plate (Nunc) was coated 
overnight in carbonate coating buffer, blocked in the provided sample buffer for
2 h at 26 °C and treated with recombinant human BDNF antibody for 2 h at 26 °C.
Acid-treated samples and provided standards were added to the plate in duplicate.
Wells were then treated for 1 h at RT with anti-IgY conjugated to horseradish
peroxidase and colour was developed with the provided 3,3′,5,5′-tetramethylbenzidine
(TMB) solution for 10 min, then stopped with 1 M HCl. Absorbance of
wells was measured at 450 nm. BDNF concentration was determined by comparing
the mean absorbance of the duplicate samples to the standards. BDNF
concentration was then normalized to total protein content and expressed as pg of
BDNF per µg of total protein.
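
The read-out described above (mean of duplicates, comparison with standards, normalization to total protein) might be computed along these lines. This is a sketch under stated assumptions, not the kit's documented procedure: it uses simple linear interpolation between neighbouring standards, treats the 1:4 dilution as a fourfold dilution factor, and assumes total protein is given in µg per ml of undiluted lysate. All function and parameter names are hypothetical.

```python
from bisect import bisect_left


def interpolate_concentration(absorbance, standards):
    """Linearly interpolate a concentration (pg/ml) from (conc, absorbance) standards."""
    points = sorted(standards, key=lambda p: p[1])
    absorbances = [a for _, a in points]
    if not absorbances[0] <= absorbance <= absorbances[-1]:
        raise ValueError("absorbance outside the range of the standards")
    i = bisect_left(absorbances, absorbance)
    if absorbances[i] == absorbance:
        return points[i][0]
    (c0, a0), (c1, a1) = points[i - 1], points[i]
    return c0 + (c1 - c0) * (absorbance - a0) / (a1 - a0)


def bdnf_pg_per_ug(duplicate_absorbances, standards,
                   protein_ug_per_ml, dilution_factor=4.0):
    """Mean of duplicates -> pg/ml via standards -> pg BDNF per µg total protein."""
    mean_abs = sum(duplicate_absorbances) / len(duplicate_absorbances)
    conc_pg_per_ml = interpolate_concentration(mean_abs, standards) * dilution_factor
    return conc_pg_per_ml / protein_ug_per_ml


# Hypothetical standards: (pg/ml, absorbance at 450 nm).
standards = [(0.0, 0.0), (100.0, 0.5), (200.0, 1.0)]
result = bdnf_pg_per_ug([0.24, 0.26], standards, protein_ug_per_ml=2000.0)
```

In this made-up example the duplicates average to an absorbance of 0.25, which interpolates to 50 pg/ml in the well, 200 pg/ml in the lysate, and 0.1 pg of BDNF per µg of protein.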

Cell culture. Dissociated hippocampal cultures were prepared as previously 
described. Briefly, whole hippocampi were dissected from Sprague-Dawley rats
at postnatal day 0-3. Tissue was trypsinized (10 mg ml⁻¹ trypsin) for 10 min at
37 °C, mechanically dissociated by pipetting and plated on Matrigel-coated coverslips.
Cytosine arabinoside (4 µM, Sigma) was added at day 1 in vitro and the
concentration of cytosine arabinoside was reduced to 2 µM at day 4 in vitro. All
experiments were performed on cultures at day 14-21. 

Cell culture recordings. Whole-cell patch-clamp recordings were performed on 
hippocampal pyramidal neurons. Data were acquired using a MultiClamp 700B 
amplifier and Clampex 9.0 software (Molecular Devices). Recordings were filtered 
at 2 kHz and sampled at 200 µs. A modified Tyrode’s solution containing 150 mM
NaCl, 4 mM KCl, 2 mM MgCl₂, 2 mM CaCl₂, 10 mM glucose and 10 mM HEPES,
pH 7.4, was used as external bath solution. The pipette-internal solution contained
115 mM CsMeSO₃, 10 mM CsCl, 5 mM NaCl, 10 mM HEPES, 0.6 mM EGTA,
20 mM tetraethylammonium chloride, 4 mM Mg-ATP, 0.3 mM Na₃GTP, pH 7.35,
and 10 mM QX-314 (N-(2,6-dimethylphenylcarbamoylmethyl)-triethylammonium
bromide), 300 mosM. Series resistance was 10-30 MΩ. To record and isolate
NMDAR-mEPSCs, the MgCl₂ concentration was reduced to 0.1 mM and
2,3-dihydroxy-6-nitro-7-sulfamoyl-benzo(f)quinoxaline-2,3-dione (NBQX; 10 µM,
Sigma) and picrotoxin (50 µM; Sigma) were added to the bath solution to block
AMPA-receptor-mediated excitatory currents and GABA (γ-aminobutyric acid)
receptor-mediated inhibitory currents, respectively. The baseline for the analysis
of NMDAR-mEPSCs was automatically determined as the average current level of
silent episodes during a recording. The events were selected at a minimum threshold
of 4 pA and the area under current deflection was calculated to quantify charge
transfer.
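
A simplified version of the charge-transfer calculation is sketched below. It is an illustration only: how silent episodes were identified is not described, so the sketch takes a precomputed mask as input, and it integrates only the samples whose deflection from baseline reaches the 4 pA threshold (a rectangle-rule simplification of "area under current deflection"). The 200 µs default assumes that the stated sampling figure is the sample interval. Function and argument names are hypothetical.

```python
def charge_transfer_pc(trace_pa, silent_mask, dt_s=200e-6, threshold_pa=4.0):
    """Integrate inward-current deflections from baseline; returns charge in pC.

    trace_pa     -- current samples in pA (inward currents are negative)
    silent_mask  -- True where the sample belongs to a silent (event-free) episode
    dt_s         -- sample interval in seconds
    threshold_pa -- minimum deflection from baseline counted as part of an event
    """
    silent = [x for x, is_silent in zip(trace_pa, silent_mask) if is_silent]
    if not silent:
        raise ValueError("no silent samples available to estimate the baseline")
    baseline = sum(silent) / len(silent)
    charge = 0.0
    for sample in trace_pa:
        deflection = baseline - sample  # positive for inward current
        if deflection >= threshold_pa:
            charge += deflection * dt_s  # pA * s = pC
    return charge


# Hypothetical trace: flat baseline with one 10 pA inward event lasting 5 samples.
trace = [0.0] * 5 + [-10.0] * 5 + [0.0] * 5
mask = [value == 0.0 for value in trace]
charge = charge_transfer_pc(trace, mask, dt_s=0.001)
```

The toy event contributes 5 samples × 10 pA × 1 ms, or about 0.05 pC.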

Field recordings. Field recordings were made from hippocampal slices from 
Sprague-Dawley rats obtained from Charles River Laboratories. Slices (400 µm)
were prepared from rats at 15-25 days old. Rats were anesthetized with euthasol
(50 mg kg⁻¹) and decapitated soon after the disappearance of corneal reflexes. The
brain was removed, dissected and then sliced using a vibratome (1000 Plus) in
ice-cold dissection buffer containing 26 mM KCl, 1.25 mM NaH₂PO₄, 26 mM
NaHCO₃, 0.5 mM CaCl₂, 5 mM MgCl₂, 212 mM sucrose and 10 mM dextrose.
Area CA3 was surgically removed from each slice immediately after sectioning.
The slices were transferred into a reservoir chamber filled with artificial cerebrospinal
fluid (ACSF) containing 124 mM NaCl, 5 mM KCl, 1.25 mM NaH₂PO₄,
26 mM NaHCO₃, 2 mM CaCl₂, 2 mM MgCl₂ and 10 mM dextrose. Slices were
allowed to recover for 2-3 h at 30 °C. ACSF and dissection buffer were equilibrated
with 95% O₂ and 5% CO₂. For recording, slices were transferred to a submerged
recording chamber, maintained at 30 °C, and perfused continuously with ACSF at
a rate of 2-3 ml min⁻¹. Field potentials were recorded with extracellular recording
electrodes (1 MΩ) filled with ACSF and placed in the stratum radiatum of area
CA1. Field potentials were evoked by monophasic stimulation (duration, 200 µs)
of Schaffer collateral/commissural afferents with a concentric bipolar
tungsten-stimulating electrode (Frederick Haer Company). Stable baseline responses were
collected every 30 s using a stimulation intensity of 10-30 µA, yielding 50-60% of
the maximal response. After recording 20 min of stable baseline, the stimulation
was stopped and 20 µM ketamine was applied for 30 min, then stimulation was
resumed. Field potentials were filtered at 2 kHz, acquired and digitized at 10 kHz
on a personal computer using custom software (LabVIEW, National Instruments).
Synaptic strength was measured as the initial slope (10-40% of the rising phase) of 
the field potential. The group data were analysed as follows: (1) the initial slopes of 
the field potential were expressed as percentages of the preconditioning baseline 
average; (2) the timescale in each experiment was converted to the time from the 
end of ketamine application; and (3) the time-matched, normalized data were 
averaged across experiments. 
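
The three-step group analysis of field-potential slopes can be sketched as follows. This is a minimal illustration rather than the authors' LabVIEW code; the function names, the dictionary representation and the example numbers are hypothetical, and times are assumed to be in minutes.

```python
from collections import defaultdict


def normalize_experiment(times_min, slopes, baseline_end_min, ketamine_end_min):
    """Steps 1 and 2: express slopes as % of baseline and re-zero the time axis
    to the end of ketamine application."""
    baseline = [s for t, s in zip(times_min, slopes) if t < baseline_end_min]
    if not baseline:
        raise ValueError("no baseline samples before baseline_end_min")
    baseline_mean = sum(baseline) / len(baseline)
    return {
        round(t - ketamine_end_min, 3): 100.0 * s / baseline_mean
        for t, s in zip(times_min, slopes)
    }


def average_experiments(experiments):
    """Step 3: average time-matched, normalized values across experiments."""
    pooled = defaultdict(list)
    for experiment in experiments:
        for t, value in experiment.items():
            pooled[t].append(value)
    return {t: sum(values) / len(values) for t, values in sorted(pooled.items())}


# Two hypothetical slices: 20 min baseline, ketamine ends at 50 min.
exp_a = normalize_experiment([0, 10, 60], [1.0, 1.0, 1.5], 20, 50)
exp_b = normalize_experiment([0, 10, 60], [2.0, 2.0, 4.0], 20, 50)
mean_trace = average_experiments([exp_a, exp_b])
```

In this toy data set the two slices potentiate to 150% and 200% of baseline 10 min after the end of ketamine application, giving a group mean of 175% at that time point.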
Structural basis of steroid hormone
perception by the receptor kinase BRI1


Michael Hothorn, Youssef Belkhadir, Marlene Dreux, Tsegaye Dabi, Joseph P. Noel, Ian A. Wilson & Joanne Chory


Polyhydroxylated steroids are regulators of body shape and size in higher organisms. In metazoans, intracellular 
receptors recognize these molecules. Plants, however, perceive steroids at membranes, using the membrane-integral 
receptor kinase BRASSINOSTEROID INSENSITIVE 1 (BRI1). Here we report the structure of the Arabidopsis thaliana BRI1 
ligand-binding domain, determined by X-ray diffraction at 2.5 Å resolution. We find a superhelix of 25 twisted
leucine-rich repeats (LRRs), an architecture that is strikingly different from the assembly of LRRs in animal Toll-like
receptors. A 70-amino-acid island domain between LRRs 21 and 22 folds back into the interior of the superhelix to create
a surface pocket for binding the plant hormone brassinolide. Known loss- and gain-of-function mutations map closely
to the hormone-binding site. We propose that steroid binding to BRI1 generates a docking platform for a co-receptor that
is required for receptor activation. Our findings provide insight into the activation mechanism of this highly expanded 
family of plant receptors that have essential roles in hormone, developmental and innate immunity signalling. 


Signal perception at the cell surface, and transduction of this signal to 
the cell’s interior, are essential to all life forms. Plants have met this 
challenge in part by evolving membrane-integral receptor kinases. 
Many of these receptors are composed of an extracellular leucine-rich 
repeat (LRR) module and a cytoplasmic kinase domain, connected by 
a single membrane-spanning helix. Receptors with this architecture
(LRR-receptor kinase, LRR-RK), for example, regulate plant growth,
development and interactions with the environment. Their corresponding
ligands range from small molecules and peptides to
entire proteins.

The LRR-RK BRI1 (refs 2, 9) controls a steroid signalling pathway
essential for plant growth. Whereas animal steroid receptors are
found predominantly in the nucleus, BRI1 is localized at the
plasma-membrane and in endosomes. The following model for
BRI1 activation has been proposed: in the absence of brassinosteroid,
BRI1’s kinase domain is kept in a basal state by its auto-inhibitory
carboxy-terminal tail, as well as by interaction with the inhibitor
protein BKI1 (ref. 14). Hormone binding to the extracellular domain
of BRI1 (refs 7, 15), in a region that includes a ~70 amino acid ‘island’
domain, causes a change in the receptor (a conformational change in
a preformed homodimer or receptor dimerization), leading to
autophosphorylation of the BRI1 kinase domain, release of its
C-terminal tail, and trans-phosphorylation of the inhibitor BKI1
(refs 14, 18). BKI1 then dissociates from the membrane, allowing
BRI1 to interact with a family of smaller LRR-RKs, including the
BRI1 ASSOCIATED KINASE 1 (BAK1). The kinase domains of
BRI1 and BAK1 trans-phosphorylate each other on multiple sites,
and the fully activated receptor triggers downstream signalling
events, resulting in major changes in nuclear gene expression.

The architecture of BRI1 is reminiscent of animal Toll-like innate
immunity receptors (TLRs), and notably several plant LRR-RKs are
immunity receptors. It was, thus, reasonable to assume that the BRI1
ectodomain would form a TLR-like horseshoe structure that binds
its ligand along a dimer interface, as observed in several TLRs.


Here we report the structure of the ligand-binding domain of 
Arabidopsis BRI1 in its free form and bound to the steroid brassino- 
lide, and show that BRI1 folds into a superhelical assembly, whose 
interior provides the hormone-binding site. Comparison of the free 
and hormone-bound structures, combined with genetic data, suggests 
a novel activation mechanism for BRI1 that is distinct from TLRs. 


Overall structure of the BRI1 ectodomain


BRI1 was expressed in baculovirus-infected insect cells and the
secreted ectodomain was purified by tandem-affinity and
size-exclusion chromatography. The crystal structure was solved to
2.5 Å resolution by single isomorphous replacement (see Methods,
Supplementary Table 1 and Fig. 1). BRI1 does not adopt the anticipated
TLR-horseshoe structure but forms a right-handed superhelix
composed of 25 LRRs (Fig. 1a). The helix completes one full turn, with
a rise of ~70 Å. The concave surface, which determines the curvature
of the solenoid, is formed by α- and 3₁₀ helices (green in Fig. 1a) that
produce inner and outer diameters of ~30 and ~60 Å, respectively.
The overall curvature of BRI1 is similar to that of TLR3 (ref. 24;
Fig. 1b), but, whereas the TLR3 ectodomain is essentially flat, BRI1
is highly twisted (Fig. 1b). Such twisted assemblies of LRRs have been
observed previously with bacterial effector and adhesion proteins,
and with the plant defence protein PGIP (Supplementary Fig. 2).
The twist of PGIP’s LRR domain is caused by a non-canonical, second
β-sheet that is oriented perpendicular to the central β-sheet forming
the inner surface of the solenoid. Additional β-sheets are also present
in our structure (blue in Fig. 1a, Supplementary Fig. 3), but in the
case of the much larger BRI1 ectodomain result in a superhelical
assembly (Fig. la). The second f-strand in PGIP and in BRI1 is 
followed by an Ile-Pro spine that runs along the outer surface of the 
helix and provides packing interactions between consecutive LRRs 
(Fig. 2a and ref. 30). Both structural features are directly linked to 
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Figure 1 | The BRI1 ectodomain forms a superhelix. a, Ribbon diagrams of 
the ectodomain. Front, back and top views are shown (left, middle and right, 
respectively). The canonical f-sheet is shown in orange, helices in green, the 
plant-specific B-sheets in blue, and the island domain is depicted in red. 


the Lt/sGxIP consensus sequence of the plant-specific LRR subfamily”" 
(Fig. 2c, Supplementary Fig. 4 and Supplementary Table 2). Because 
this consensus sequence is found in other plant receptor kinases, these 
receptors may also harbour twisted LRR domains (Fig. 2c), making 
BRII the primary template for the study of diverse signalling pathways 
in plants’ ®. 


b, Structural comparison of BRI1 (Cα trace, in yellow) and TLR3 (in blue; PDB 
1ZIW). The structures superimpose with a root mean square deviation of 
4.2 Å between 341 corresponding Cα atoms. Top and side views are shown (left 
and right, respectively). The island domain has been omitted for clarity. 


Amino- and C-terminal flanking regions that cap the hydrophobic 
core of the BRI1 solenoid are similar to caps previously described for 
PGIP (Supplementary Fig. 5). Notably, not only are these caps stabilized 
by disulphide bridges, but five additional disulphide bonds link 
consecutive LRR segments in the N-terminal half of the BRI1 ectodomain 
(Fig. 2b, Supplementary Fig. 4 and Supplementary Table 2). 


Protein     Start   End    Sequence
Consensus                  LxxLxxLxLxxNxLSGxIPxxLGx
AtBRI1      439     545    CSELVSLHLSFNYLSGTIPSSLGS
AtBRI1      677     700    MPYLFILNLGHNDISGSIPDEVGD
AtBAK1      115     138    LTELVSLDLYLNNLSGPIPSTLGR
AtCLV       288     311    LVSLKSLDLSINQLTGEIPQSFIN
AtFLS2      263     286    CSSLVQLELYDNQLTGKIPAELGN
AtEFR       416     439    LLNLQVVDLYSNAISGEIPSYFGN
AtTMM       252     275    CGSLIKIDLSRNRVTGPIPESINR
PvPGIP      100     123    LTQLHYLYITHTNVSGAIPDFLSQ

Conserved consensus positions: 1, 4, 7, 9, 12, 14-16, 18-19, 22-23.


Figure 2 | Plant-specific sequence fingerprints cause the superhelical 
arrangement. a, Ribbon diagram of the convex side of LRRs 9-25. The 
non-canonical β-strands and the Ile-Pro spine are shown in dark and light 
blue, respectively. b, Top view of the BRI1 ectodomain. Disulphide bridges 
are in yellow, the N- and C-terminal caps are in pink. c, Sequence alignment 
of LRRs in BRI1, other plant receptor kinases and PGIP. At, Arabidopsis 
thaliana. Pv, Phaseolus vulgaris. The canonical LRR consensus sequences are 
highlighted in blue, plant-specific motifs in green. 
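
The plant-specific fingerprint can be checked against the repeats aligned in Fig. 2c. The Python sketch below is illustrative only. The sequences are transcribed from the alignment with extraction spacing removed and the character "O" read as Q (an assumption), and the two regular expressions are a simplified, hypothetical reading of the Lt/sGxIP motif rather than a pattern defined in the paper.

```python
import re

# Repeats transcribed from the Fig. 2c alignment: (protein, start residue, sequence).
LRRS = [
    ("AtBRI1", 439, "CSELVSLHLSFNYLSGTIPSSLGS"),
    ("AtBRI1", 677, "MPYLFILNLGHNDISGSIPDEVGD"),
    ("AtBAK1", 115, "LTELVSLDLYLNNLSGPIPSTLGR"),
    ("AtCLV", 288, "LVSLKSLDLSINQLTGEIPQSFIN"),
    ("AtFLS2", 263, "CSSLVQLELYDNQLTGKIPAELGN"),
    ("AtEFR", 416, "LLNLQVVDLYSNAISGEIPSYFGN"),
    ("AtTMM", 252, "CGSLIKIDLSRNRVTGPIPESINR"),
    ("PvPGIP", 100, "LTQLHYLYITHTNVSGAIPDFLSQ"),
]

# Assumed patterns (hypothetical simplifications, not taken from the paper):
# CORE reads consensus positions 15-19 as [S/T]-G-x-I-P;
# STRICT additionally requires Leu at consensus position 14.
CORE = re.compile(r"[ST]G.IP")
STRICT = re.compile(r"L[ST]G.IP")


def has_core(seq):
    """True if consensus positions 15-19 (1-based) match [ST]GxIP."""
    return CORE.fullmatch(seq[14:19]) is not None


def has_strict(seq):
    """True if consensus positions 14-19 (1-based) match L[ST]GxIP."""
    return STRICT.fullmatch(seq[13:19]) is not None


core_hits = [name for name, _, seq in LRRS if has_core(seq)]
strict_hits = [name for name, _, seq in LRRS if has_strict(seq)]

for name, start, seq in LRRS:
    print(f"{name:7s} {start:4d} core={has_core(seq)} strict={has_strict(seq)}")
```

Under these assumptions every listed repeat carries the [S/T]GxIP core at the same register, while the preceding Leu is replaced by Ile or Val in some repeats, which is why the looser pattern is the more useful screen.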


The island domain 


The island domain in BRI1 corresponds to a large insertion in the 
regular repeat-structure between LRRs 21 and 22 (residues 584-654; 
Fig. 1a). The resulting ~70-residue segment forms a small domain 
that folds back into the interior of the superhelix, where it makes 
extensive polar and hydrophobic interactions with LRRs 13-25 
(Fig. 1a, Supplementary Fig. 6 and Supplementary Table 2). The 
domain fold is characterized by an anti-parallel β-sheet, which is 
sandwiched between the LRR core and a 3₁₀ helix and stabilized by a 
disulphide bridge (Fig. 3a, Supplementary Fig. 4). The loss-of-function 
alleles bri1-9 (Ser662Phe, weak) and bri1-113 (Gly611Glu, strong) 
map to this island domain-LRR interface (Supplementary Fig. 6), and 
probably interfere with folding of the island domain. Two long loops 
that connect the island domain to the LRR core appear partially 
disordered in the unliganded receptor (Supplementary Fig. 7). The 
insertion of a folded domain into the LRR repeat has not been observed in 
other LRR receptor structures, and is probably an adaptation to the 
challenge of sensing a small steroid ligand (rather than larger ligands, 
such as proteins, nucleic acids, or lipids). 

We next solved a 2.5 Å co-crystal structure with brassinolide, a 
potent Arabidopsis steroid that binds BRI1 with nanomolar affinity. 
One molecule of brassinolide per BRI1 monomer binds in close 
proximity to the island domain (Fig. 3a–c), which was previously implicated 
in steroid binding. Our structure reveals that the LRR superhelix and 
the island domain both contribute extensively to formation of the 
hormone-binding site. The A-D rings of the steroid bind to a 
hydrophobic surface that is provided by LRRs 23-25 and maps to the 
inner side of the BRI1 superhelix (Fig. 3b, d, Supplementary Fig. 8). The 
alkyl chain of the hormone fits into a small pocket formed by residues 
originating from LRRs 21 and 22 (Ile 563, Trp 564, Met 657, Phe 658) 
and from two loops connecting the island domain with the LRR core 
(Fig. 3d). The hydrophobic nature and restricted size of this pocket 
now explain why steroid ligands with bulkier or charged alkyl side 
chains, such as the arthropod steroid ecdysone (Supplementary Fig. 8), 
cannot be recognized by BRI1 (ref. 7). A few polar interactions with 
brassinolide's second diol moiety (Fig. 3d) are established with Tyr 597 
and main-chain atoms from His 645 and Ser 647 in the island domain, 
and are mediated by water molecules (Fig. 3d). Mutation of the 
neighbouring Gly 644 to Asp may interfere with this hydrogen-bonding 
network, and explain why this mutation greatly reduces the binding 
activity of the receptor and causes the loss-of-function phenotype 
bri1-6 (ref. 32; Fig. 3d). No polar contacts are observed with the 
seven-membered B-ring lactone (Fig. 3d), consistent with B-ring 
modifications as found in, for example, castasterone (Supplementary 
Fig. 8) being tolerated by BRI1 (refs 7, 35). 

Figure 3 | The steroid hormone binding site maps to the C-terminal inner 
surface of the superhelix. a, Brassinolide (yellow sticks) binds to a surface 
provided by the LRR domain (in blue) and by parts of the island domain (green 
ribbon). b, Location of the steroid (arrowed) in the centre of the BRI1 
superhelix. c, Close-up view of brassinolide, including an omit 2Fo − Fc 

electron density map contoured at 1.5 σ. d, Protein-hormone interactions in 
the BRI1 steroid binding site. Ribbon diagram of LRRs 21-25 (in grey) is 
shown, together with parts of the island domain (in green). Contacting residues 
are in full side-chain representation, polar interactions are dotted lines, and 
water molecules are red spheres. bri1-6 (Gly644Asp) is depicted in magenta. 

The steroid-complex reveals a hormone-binding site that involves a 
much larger portion of the LRR domain than previously anticipated. 
Major interactions between the steroid and the BRI1 ectodomain 
originate from the very C-terminal LRRs 23-25, which bring the hormone 
in close proximity to the membrane (Fig. 3a, d). Importantly, while 
there is a significant hormone-receptor interface (550 Å²) for such a 
small molecule ligand, large parts of the steroid are exposed to the 
solvent, including the 2α,3α-diol moiety in brassinolide that is 
important for biological activity. Thus, protein-protein interactions may be 
involved in the recognition of the steroid ligand, with the hormone 
itself providing a docking platform. Importantly, steroid binding 
induces a conformational rearrangement and fixing of the island 
domain, which becomes fully ordered and competent to participate 
in protein-protein interactions that could be critical for receptor 
activation (Supplementary Fig. 7). 

A protein interaction platform 

Four known BRI1 mis-sense alleles map to the inner surface of the last 
five LRRs (Fig. 4a). This surface is not masked by carbohydrate 
(Supplementary Fig. 9), and contains both the hormone-binding site 
and the island domain (Figs 3a, d and 4a). Three mutations cluster in a 
loop connecting the island domain with LRR 22 (Fig. 4a). This loop is 
partially disordered in the unliganded structure but is well-defined in 
the brassinolide complex (Supplementary Fig. 7). We speculate that 
this loop, when ordered, is engaged in protein-protein interactions 
that are critical for receptor activation, and that mis-sense alleles in 
BRI1 modulate these interactions. The gain-of-function allele sud1 
(ref. 36; Gly643Glu) may establish contact with Ser 623 in the island 
domain, and lead to an ordered loop even in the absence of steroid 
ligand (Supplementary Fig. 10). Mutation of the neighbouring Gly 644 
to Asp causes the loss-of-function phenotype bri1-6 (ref. 32; see above, 
and Figs 3d, 4a), and mutation of conserved Thr 649 to Lys inactivates 
barley BRI1 (ref. 37). These mutations, when modelled in silico, induce 
steric clashes with residues in the island domain and in the underlying 
LRR domain (Supplementary Fig. 10), and thus may distort the 
position of the loop. Interestingly, bri1-102, a strong loss-of-function 
mutation (Thr750Ile) that does not affect steroid binding, maps to a 
distinct surface area in LRR 25 (Fig. 4a). Thus protein-protein 
interactions critical for receptor activation may not be restricted to the 
island domain, but also involve residues from the LRR core. 


Receptor activation 

BRI1 has been shown to exist at least partially as a homo-oligomer in 
planta. Thus steroid binding to the island domain and the 
concomitant rearrangements of the island domain loop could induce a 
conformational change in a preformed BRI1 homodimer, or allow 
for ligand-dependent dimerization of the BRI1 ectodomain. However, 
models of BRI1 dimers that bring the C termini of their ectodomains 
into close proximity (we note that the cytoplasmic kinase domains of 
BRI1 can interact) and that make use of the interaction surface 
outlined above, encounter steric clashes with the N-terminal LRRs 
(Supplementary Fig. 11). Furthermore, in contrast to TLR 
ectodomains, which in crystals tend to form homodimers even in the absence 
of ligand, dimers cannot be seen in BRI1 crystals grown under the 
same acidic pH conditions that are typically associated with the plant 
cell wall. The largest interface area between two neighbouring BRI1 
molecules amounts to only ~1.5% of the total accessible surface area, 
consistent with the high solvent content of our crystals (see 
Supplementary Methods). The main crystal contact involves a 
head-to-head arrangement of two BRI1 monomers, a configuration that would 
place the cytoplasmic kinase domains far apart (Supplementary Fig. 
12). In size-exclusion chromatography experiments, the recombinant 
BRI1 ectodomain elutes as a monomer in the absence of steroid 
ligand, and shows no tendency to dimerize or oligomerize in the 
presence of a ~4-fold molar excess of brassinolide (Fig. 4b). 

Figure 4 | An accessible membrane-proximal region of BRI1 may provide a 
protein-protein interaction platform. a, Overview of the C-terminal surface 
area not masked by carbohydrate. Brassinolide is shown in yellow, the island 
domain in orange, and genetic alleles in magenta. b, Analytical gel-filtration 
trace (absorbance at 280 nm). The free ectodomain elutes as a monomer (black 
line), as does a putative complex with brassinolide (red line). Void volume (V0) 
and total volume (Vt) are shown, together with elution volumes for molecular 
mass standards (A, aldolase, molecular mass 158,000 Da; B, conalbumin, 
molecular mass 75,000 Da). The calculated molecular mass for the monomer 
peak is ~125 kDa. The molecular mass of purified BRI1 is ~110 kDa. 

Our analyses suggest that the superhelical BRI1 LRR domain alone 
has no tendency to oligomerize, indicating that BRI1 receptor activation 
may not be mediated by ligand-induced homodimerization of the 
ectodomain (as described for TLRs) or by conformational changes in 
preformed homodimers. We do not dismiss the facts that the 
cytoplasmic kinase domain of BRI1 can dimerize, or that BRI1 
homo-oligomers are present in vivo. However, our structures reinforce 
the notion that homo-oligomerization of BRI1 may be constitutive and 
independent of ligand stimulus. The presence of an interaction 
platform that undergoes conformational changes when steroid binds and 
that harbours several loss- and gain-of-function alleles suggests that 
interaction with another protein factor may control BRI1 activation. 


Discussion 


The structure of the BRI1 ectodomain offers several new insights, and 
its twisted shape is likely to characterize the architecture of many 
plant LRR-RKs. The presence of a folded-domain insert appears to 
be an adaptation to the recognition of a small molecule ligand, a 
challenge that smaller LRR proteins have met by generating loop 
insertions into their capping motifs. BRI1's fascinating mode of 
ligand recognition reveals how steroids can be sensed at membranes 
and rationalizes a large set of genetic and biochemical findings. 

Different BRI1 receptor activation mechanisms have been proposed, 
including ligand-dependent dimerization as seen for TLRs and 
ligand-induced conformational changes in preformed homodimers. 
Our analyses suggest that the superhelical shape of the BRI1 
ectodomain is incompatible with homodimerization, and that the isolated 
ectodomain behaves as a monomer even in the presence of steroid. 
These findings leave us with the alternative hypothesis that another 
protein factor could bind to the interaction platform in BRI1 that 
would minimally encompass the steroid ligand, LRRs 21-25 and parts 
of the island domain (Fig. 4a). Although it is possible that an unknown 
protein fulfils this role and provides a dimerization interface for two 
BRI1 molecules (as seen, for example, for TLR4; ref. 26), genetic and 
biochemical screens have not uncovered this protein. It is thus possible 
that the small receptor kinase BAK1 acts as a direct brassinosteroid 
co-receptor, as suggested previously. It has been demonstrated that 
BAK1 is a genetic component of the brassinosteroid pathway, that 
BRI1 and BAK1 interact in a steroid-dependent manner and that 
both receptors trans-phosphorylate each other on ligand stimulus. 
Notably, a homology model of the BAK1 ectodomain (Supplementary 
Fig. 13) is compatible in size and shape with the interaction platform in 
BRI1, and the BAK1 elg allele, which maps to the BAK1 ectodomain 
(Supplementary Fig. 14), renders plants hypersensitive to 
brassinosteroid treatment. We speculate that the sud1, bri1-6, bri1-102 and elg 
mutations modulate the interaction between the BRI1 and BAK1 
ectodomains in a brassinosteroid-dependent manner (Supplementary Fig. 
14). The demonstration that BAK1 is essential for brassinosteroid 
sensing may have been obscured owing to genetic redundancy, with 
at least two BAK1-like proteins interacting with BRI1 in vivo. We 
recently overcame this limitation by showing that the BRI1 inhibitor 
protein BKI1 blocks the interaction between the BAK1 and BRI1 
kinase domains. Importantly, transgenic lines that constitutively 
deliver BKI1 to the site of BRI1 signalling resemble strong BRI1 
loss-of-function mutants, suggesting an important role for 
receptor–co-receptor association in brassinosteroid signal initiation. 

Future studies will undoubtedly test this heteromerization model 
and dissect the relative contributions of the BRI1 and BAK1 
ectodomains, their transmembrane segments and their cytoplasmic kinase 
domains to receptor activation. It will become important then to 
understand how BAK1 could serve as co-receptor for other LRR-RK 
signalling pathways. 


METHODS SUMMARY 


The BRI1 ectodomain (residues 29-788) was produced by secreted expression in 
baculovirus-infected insect cells, harvested 4 d after infection by ultrafiltration 
and purified by tandem-affinity chromatography, then by gel filtration. BRI1 was 
concentrated to 15 mg ml⁻¹ and crystallized by vapour diffusion using a reservoir 
solution containing 14% PEG 4,000, 0.2 M (NH4)2SO4, 0.1 M citric acid (pH 4.0). 
The brassinolide complex was obtained by co-crystallization. Diffraction data to 
2.5 Å resolution were collected on a rotating anode X-ray generator and at 
beam-line 8.2.1 of the Advanced Light Source (ALS), Berkeley. The structure was solved 
using the single isomorphous replacement method. Data and refinement statistics 
are summarized in Supplementary Table 1. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 

METHODS 


Protein expression and purification. A synthetic gene comprising the entire 
BRI1 ectodomain (residues 29-788) and codon optimized for expression in 
Trichoplusia ni was synthesized by Geneart. The gene was cloned into a modified 
pBAC-6 transfer vector (Novagen), providing a glycoprotein 64 signal peptide 
and a C-terminal TEV (tobacco etch virus protease) cleavable Strep-9xHis 
tandem-affinity tag. Recombinant baculoviruses were generated by co-transfecting 
the transfer vector with linearized baculovirus DNA (ProFold-ER1, AB vector) 
and amplified in Sf9 cells. The fusion protein was expressed in Hi5 cells using a 
multiplicity of infection of 5, and harvested from the medium 4 days after 
infection by tangential flow filtration using a 30 kDa MWCO (molecular weight 
cut-off) filter membrane (GE Healthcare). BRI1 was purified by sequential Co²⁺ (His 
select gel, Sigma) and Strep (Strep-Tactin Superflow high-capacity, IBA) affinity 
chromatography. Next, the tandem-affinity tag was removed by incubating 
purified BRI1 with recombinant TEV protease in a 1:100 molar ratio. The cleaved tag 
and the protease were separated from BRI1 by size-exclusion chromatography on 
a Superdex 200 HR10/30 column (GE Healthcare) equilibrated in 20 mM HEPES 
(pH 7.5), 100 mM NaCl, 1 mM EDTA. Monomeric peak fractions were 
concentrated to ~15 mg ml⁻¹ and snap frozen in liquid nitrogen. About 50-80 µg of 
purified BRI1 could be obtained from 1 l of insect cell culture. 

Crystallization and data collection. Initial crystals of BRI1 appeared in 18% PEG 
4,000, 0.8 M KCl using the counter diffusion method. Diffraction quality crystals 
of about 300 × 80 × 600 µm could be grown after multiple rounds of 
microseeding at room temperature by vapour diffusion in hanging drops composed of 
1.25 µl of protein solution (15 mg ml⁻¹) and 1.25 µl of crystallization buffer 
(14% PEG 4,000, 0.2 M (NH4)2SO4, 0.1 M citric acid pH 4.0) suspended above 
1.0 ml of the mother liquor as the reservoir solution. For structure solution 
crystals were stabilized, derivatized and cryo-protected by serial transfer into 
16% PEG 4,000, 1.7 M Na malonate (pH 4.0) and 0.5 M NaI, and cryo-cooled 
in liquid nitrogen. Single-wavelength anomalous diffraction (SAD) data to 2.9 Å 
resolution were collected on a Rigaku MicroMax rotating anode equipped with a 
copper filament (λ = 1.5418 Å), Osmic mirrors and an R-AXIS IV++ detector. 
Native crystals were transferred to a cryo-protective solution containing 16% 
PEG 4,000 and 1.7 M Na malonate (pH 4.0) and flash-cooled in liquid nitrogen. 
An isomorphous native data set to 2.5 Å was collected at beam-line 8.2.1 
(λ = 0.9998 Å) of the Advanced Light Source (ALS), Berkeley. The 
hormone-bound structure was obtained by dissolving brassinolide (Chemiclones Inc.) to 
a concentration of 1 mM in 100% DMSO. This stock solution was diluted to a 
final concentration of about 50 µM in protein storage buffer (20 mM HEPES pH 
7.5, 100 mM NaCl, 1 mM EDTA). Purified BRI1 protein was added to a final 
concentration of about 12.5 µM (1.5 mg ml⁻¹) and the mixture was incubated at 
room temperature for 16 h. Next, the complex was re-concentrated to 18 mg 
ml⁻¹, and immediately used for crystallization. Crystals appeared under similar 
conditions as established for the unbound form and diffracted again to about 
2.5 Å (λ = 1.5418 Å). Data processing and scaling were done with XDS (version: 
May 2010) (Supplementary Table 1). 
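
The concentrations quoted for the brassinolide incubation can be cross-checked with simple arithmetic. The sketch below is only a consistency check on the stated values (50 µM ligand, 12.5 µM protein at 1.5 mg ml⁻¹); the function names are hypothetical and nothing here reproduces a calculation described by the authors.

```python
def implied_molar_mass(mass_conc_mg_per_ml, molar_conc_uM):
    """Molar mass (g/mol) implied by a paired mass and molar concentration."""
    grams_per_litre = mass_conc_mg_per_ml      # 1 mg/ml equals 1 g/l
    mol_per_litre = molar_conc_uM * 1e-6       # micromolar to molar
    return grams_per_litre / mol_per_litre


def molar_excess(ligand_uM, protein_uM):
    """Fold molar excess of ligand over protein."""
    return ligand_uM / protein_uM


# Values taken from the text: 1.5 mg/ml protein at 12.5 uM, 50 uM brassinolide.
mass = implied_molar_mass(1.5, 12.5)
excess = molar_excess(50.0, 12.5)

print(f"implied molar mass: {mass / 1000:.0f} kDa")
print(f"ligand excess: {excess:.1f}-fold")
```

The pairing of 12.5 µM with 1.5 mg ml⁻¹ implies a molar mass of about 120 kDa, and the ligand-to-protein ratio works out to fourfold, matching the ~4-fold molar excess mentioned for the gel-filtration experiment.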

Structure solution and refinement. The program XPREP (Bruker AXS) was 
used to scale native and derivative data for SIRAS (single isomorphous 
replacement with anomalous scattering) analysis. Using data between 30 and 3.7 Å, 
SHELXD located 52 iodine sites (CC All/Weak 42.50/19.82). 16 consistent sites 
were input into the program SHARP for phasing and identification of 10 
additional sites at 2.9 Å resolution (Supplementary Fig. 1a). Refined heavy atom sites 
and phases were input into phenix.resolve for density modification and phase 
extension to 2.5 Å (final FOM was 0.55). The resulting electron density map was 
readily interpretable (Supplementary Fig. 1b), and the structure was completed in 
alternating cycles of model building in COOT and restrained TLS refinement in 
phenix.refine (http://www.phenix-online.org). Refinement statistics are 
summarized in Supplementary Table 1. The crystals contain one BRI1 monomer 
per asymmetric unit with a solvent content of ~60%. The final models comprise 
residues 29-771, with the C termini (residues 772-788) being completely disordered. 
The structure contains 25 LRRs as initially proposed, and not 24 LRRs as concluded 
from later modelling studies. Loop residues 590, 637 and 638 in the island domain 
appear disordered in the unliganded structure. Amino acids whose side chains could 
not be modelled with confidence were truncated to alanine (2% of all residues). 
Analysis with MolProbity suggested that both refined models have excellent 
stereochemistry, with the free form having 93.3% of all residues in the favoured 
region of the Ramachandran plot, and no outliers (MolProbity score is 2.2, 
corresponding to the 90th percentile for structures (N = 6,681) at 2.52 ± 0.25 Å resolution). 
The brassinolide complex structure has 92.7% of all residues in the favoured region of 
the Ramachandran plot and no outliers (MolProbity score is 2.3, corresponding to the 
86th percentile for structures (N = 6,632) at 2.54 ± 0.25 Å resolution). 
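
For orientation, the quoted solvent content of ~60% can be related to a Matthews coefficient. The sketch below assumes the conventional approximation, solvent fraction = 1 − 1.23/V_M with V_M in Å³ per Da; the text does not say how the solvent content was derived, and the constant 1.23 assumes a typical protein partial specific volume, which a heavily glycosylated ectodomain may not match exactly.

```python
def solvent_fraction(matthews_vm):
    """Fractional solvent content from a Matthews coefficient (A^3 per Da).

    Uses the conventional approximation 1 - 1.23 / V_M (an assumption here).
    """
    return 1.0 - 1.23 / matthews_vm


def matthews_from_solvent(fraction):
    """Invert the same approximation: V_M implied by a solvent fraction."""
    return 1.23 / (1.0 - fraction)


vm = matthews_from_solvent(0.60)    # solvent content quoted in the text
check = solvent_fraction(vm)        # round trip back to the input fraction

print(f"implied Matthews coefficient: {vm:.2f} A^3/Da")
```

Under this approximation a 60% solvent content corresponds to a Matthews coefficient of roughly 3.1 Å³ Da⁻¹, at the loosely packed end of what protein crystals usually show.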
Size-exclusion chromatography. This was performed using a Superdex 200 HR 
10/30 column (GE Healthcare) pre-equilibrated in 25 mM citric acid/sodium 
citrate buffer (pH 4.5), 100 mM NaCl. 100 µl of sample (5 mg ml⁻¹) was loaded 
onto the column and elution at 0.6 ml min⁻¹ was monitored by ultraviolet 
absorbance at 280 nm. Incubation with brassinolide was performed as described in the 
crystallization section. 

Homology modelling. Homology modelling of the AtBAK1 ectodomain 
(residues 27-227; Uniprot http://www.uniprot.org; accession Q94F62) was performed 
with the program MODELLER (http://www.salilab.org/modeller/) using the BRI1 
and PGIP (PDB 1OGQ) structures as templates. Structure-based sequence 
alignments were done using T-COFFEE (http://www.tcoffee.org). BRI1 and BAK1 
share ~35%, and PGIP and BAK1 share ~31%, sequence identity, with the LRR and 
N-cap consensus sequences being highly conserved. 
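
Sequence identity figures such as these depend on the alignment and on which columns are counted. The sketch below shows one common convention (identical residues divided by aligned, ungapped columns), applied to a single pair of repeats from Fig. 2c. The function is hypothetical, the result is not expected to reproduce the ~35% whole-ectodomain value, and other tools may use a different denominator.

```python
def percent_identity(a, b):
    """Percent identical residues over aligned columns where neither has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


# One LRR each from AtBRI1 (residue 439 onward) and AtBAK1 (residues 115-138),
# transcribed from the Fig. 2c alignment.
bri1_lrr = "CSELVSLHLSFNYLSGTIPSSLGS"
bak1_lrr = "LTELVSLDLYLNNLSGPIPSTLGR"

identity = percent_identity(bri1_lrr, bak1_lrr)
print(f"identity over this repeat: {identity:.1f}%")
```

A single consensus-rich repeat scores far higher than the full-length comparison, which is in line with the remark that the LRR consensus positions are the most conserved part of the alignment.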
Structure-based design of non-natural amino-acid 
inhibitors of amyloid fibril formation 

Many globular and natively disordered proteins can convert 
into amyloid fibrils. These fibrils are associated with numerous 
pathologies as well as with normal cellular functions, and 
frequently form during protein denaturation. Inhibitors of 
pathological amyloid fibril formation could be useful in the development 
of therapeutics, provided that the inhibitors were specific enough 
to avoid interfering with normal processes. Here we show that 
computer-aided, structure-based design can yield highly specific 
peptide inhibitors of amyloid formation. Using known atomic 
structures of segments of amyloid fibrils as templates, we have 
designed and characterized an all-D-amino-acid inhibitor of the 
fibril formation of the tau protein associated with Alzheimer's 
disease, and a non-natural L-amino-acid inhibitor of an amyloid 
fibril that enhances sexual transmission of human 
immunodeficiency virus. Our results indicate that peptides from 
structure-based designs can disrupt the fibril formation of full-length 
proteins, including those, such as tau protein, that lack fully 
ordered native structures. Because the inhibiting peptides have 
been designed on structures of dual-β-sheet 'steric zippers', the 
successful inhibition of amyloid fibril formation strengthens the 
hypothesis that amyloid spines contain steric zippers. 

The finding that dozens of pathologies, including Alzheimer's disease, 
are associated with amyloid fibrils has stimulated research on fibril 
inhibition. One approach uses the self-associating property of proteins 
that form fibrils to poison fibril formation with short peptide 
segments. A second approach is based on screening for molecules that 
can disrupt fibril formation. Here we take a third approach to fibril 
inhibition: structure-based design of non-natural peptides targeted to 
block the ends of fibrils. With advanced sampling techniques and by 
minimizing an appropriate energy function, we identify novel candidate 
inhibitors computationally from a large peptide space that interact 
favourably with our template structure. This approach has been made 
possible by the determination of several dozen fibril-like atomic 
structures of segments from amyloid-forming proteins. 

These structures reveal a common motif called a steric zipper, 
in which a pair of β-sheets is held together by the interdigitation of 
their side chains. Using as templates the steric-zipper structures 
formed by segments of two pathological proteins, we have designed 
inhibitors that cap fibril ends. As we show, the inhibitors greatly slow 
the fibril formation of the parent proteins of the segments, offering a 
route to designed chemical interventions and supporting the 
hypothesis that steric zippers are the principal structural elements of these 
fibrils. 

One of the two fibril-like steric zippers that we have chosen as a 
target for inhibitor design is the hexapeptide VQIVYK, residues 
306-311 of the tau protein, which forms intracellular amyloid fibrils in 
Alzheimer's disease. This segment has been shown to be important 
for fibril formation of the full-length protein and itself forms fibrils 
with biophysical properties similar to full-length tau fibrils. Our 
second template for inhibitor design, identified by the '3D profile' 
algorithm, is the steric-zipper structure of the peptide segment 
GGVLVN from the amyloid fibril formed by ²⁴⁸PAP²⁸⁶, a proteolytic 
fragment containing residues 248-286 of prostatic acid phosphatase, a 
protein abundant in semen. ²⁴⁸PAP²⁸⁶ fibrils, also known as 
semen-derived enhancer of virus infection (SEVI), enhance human 
immunodeficiency virus (HIV) infection by orders of magnitude in cell culture 
studies, whereas the monomeric peptide is inactive. 

Our computational approach to designing non-natural peptides 
that inhibit fibril formation is summarized in Fig. 1 for the VQIVYK 
segment of tau protein; the same general strategy is used for the 
GGVLVN segment of ²⁴⁸PAP²⁸⁶. In both systems, we design a tight 
interface between the inhibiting peptide and the end of the steric zipper 
to block additional segments from joining the fibril. By sampling L or D 
amino acids, or commercially available non-natural amino acids, we can 
design candidate inhibitors with side chains that maximize hydrogen 
bonding and hydrophobic interactions across the interface. 

We propose that the steric-zipper structures of the VQIVYK and 
GGVLVN segments represent the spines of the fibrils formed by the 
parent proteins containing these segments. Supporting our hypothesis 
are our results that D-amino-acid inhibitors designed on the VQIVYK 
steric-zipper template inhibit fibril formation not only of the VQIVYK 
segment, but also of two tau constructs, K12 and K19 (Fig. 2a). 
Similarly, the peptide composed of non-natural amino acids designed 
on the GGVLVN template inhibits the fibril formation of ²⁴⁸PAP²⁸⁶ 
and greatly inhibits the HIV infectivity of human cells in culture. 

To design a D-amino-acid hexapeptide sequence that interacts 
favourably with the VQIVYK steric zipper, and prevents further 
addition of tau molecules to the fibril, we used the Rosetta software. This 
led to the identification of four D-amino-acid peptides: D-TLKIVW, 
D-TWKLVL, D-DYYFEF and D-YVIIER, in which the prefix signifies 
that all α-carbon atoms are in the D configuration (Fig. 2b, c, 
Supplementary Figs 1 and 2 and Supplementary Table 1). In the 
D-TLKIVW design model (Fig. 2b, c and Supplementary Fig. 1), the 
inhibitor packs tightly across the top of the VQIVYK steric-zipper 
structure, maintaining all main-chain hydrogen bonds. The side-chain 
hydrogen bonding between layers of stacked Gln 307 residues is 
replaced in the designed interface by an interaction with D-Lys 3. 
Several hydrophobic interactions between D-TLKIVW and the two 
VQIVYK β-strands contribute to the favourable binding energy 
(Supplementary Table 1). In the design, the D-peptide blocks the 
addition of another layer of VQIVYK, both above the D-peptide and across 
on the mating β-sheet (Supplementary Fig. 3). D-Leu 2 of the designed 
inhibitor prevents the addition of a VQIVYK molecule above it through 
a steric clash with Ile 308 of VQIVYK and on the mating sheet through 
a clash with Val 306 and Ile 308 (Supplementary Fig. 3). These steric 
clashes involving D-Leu 2 are intended to block fibril growth. 

Figure 1 | Design and characterization of peptide inhibitors of amyloid 
fibril formation. Tau constructs form fibrils in vitro (top left; scale bar, 
200 nm). The VQIVYK segment in isolation forms fibrils and microcrystals 
(bottom left; fibril scale bar, 200 nm; microcrystal scale bar, 100 µm). The atomic 
structure of the fibril-like VQIVYK segment reveals a characteristic steric-zipper 
motif comprising a pair of interacting β-sheets (purple and grey) running along 
the fibril axis (grey arrow) (bottom right). We designed a D-amino-acid peptide 
to bind to the end of the steric-zipper template and prevent fibril elongation 
(middle right). The D-peptide (red) is designed to satisfy hydrogen bonds and 
make favourable non-polar interactions with the molecule below, while 
preventing the addition of other molecules above and on the opposite β-sheet. As 
shown in vitro, the designed D-peptide prevents the formation of fibrils when 
incubated with tau K19 (upper right; scale bar, 200 nm). 


We used fluorescence spectroscopy and electron microscopy to 
assess whether the designed D-peptides inhibit the fibril formation of 
the tau segment VQIVYK and of the tau constructs K12 and K19. Of 
our designed inhibitors, D-TLKIVW is the most effective (Supplemen- 
tary Fig. 4). Electron microscopy, performed after three days, verified 
that incubation with equimolar D-TLKIVW prevents K19 fibril forma- 
tion, which would otherwise have occurred within the elapsed time 
(Fig. 1, upper right). D-TLKIVW delays fibril formation of VQIVYK, 
K12 and K19 even when present in sub-equimolar concentration 
(Supplementary Fig. 5). A fivefold molar excess of D-TLKIVW delays 
K12 fibril formation for more than two weeks in some experimental 
replicates (Supplementary Fig. 5c, d). In tenfold molar excess, 
D-TLKIVW prevents the fibril formation of K12 for more than 60 
hours in the presence of preformed K12 fibril seeds, suggesting that 
the peptide interacts with fibrils (Fig. 2d). Also, kinetic analysis shows 
that the fibril elongation rate decreases in the presence of increasing 
concentrations of inhibitor peptide (Supplementary Fig. 6). The large 
increase in lag time in unseeded reactions may be due to interactions 
with small aggregates formed during the process of fibril formation. 

Figure 2 | Designed D-peptide delays tau K12 fibril formation in a 
sequence-specific manner. a, Tau construct composition. The longest human tau 
isoform found in the central nervous system, hTau40 (Uniprot ID, P10636-8), 
contains four microtubule-binding repeats, R1 to R4, whereas K12 and K19 
lack R2. The black bars at the amino termini of R2 and R3 represent the 
fibrillogenic segments VQIINK and VQIVYK, respectively. b, The inhibitor 
D-TLKIVW (red) is designed to interact with atoms on both β-strands of the 
VQIVYK steric zipper (grey) primarily through hydrophobic packing and 
hydrogen-bonding interactions. c, The inhibitor interacts with the VQIVYK 
β-strand below. The transparent spheres show where the two molecules 
interact favourably. Black and red dashes indicate main-chain and side-chain 
hydrogen bonds, respectively. Stereo views of b and c are shown in 
Supplementary Fig. 1. d, The seeded fibril formation of 50 µM K12 in the 
presence and absence of a tenfold molar excess of peptide was monitored by 
Thioflavin S fluorescence. In the presence of the scrambled peptide D-TIWKVL 
(dark green) and alone (black), seeded K12 fibril formation occurs with almost 
no lag time. However, D-TLKIVW prevents fibril formation for days (maroon). 
e, At equimolar concentrations, D-TLKIVW (red) inhibits the fibril formation 
of 50 µM K12. D-TIKWVL (blue), with only three residues scrambled, shows 
weak inhibition. However, no inhibition is observed for either D-TIWKVL 
(green) or D-LKTWIV (cyan). f, The replacement of D-Leu 2, designed to clash 
with VQIVYK on the opposite sheet, with D-Ala eliminates the inhibition of 
fibril formation. 

To investigate the specificity of the designed inhibitor, we tested 
scrambled sequence variants of D-TLKIVW that have poor (that is, 
high) calculated energies and unfavourable packing 
(Supplementary Table 1). The scrambled peptides D-TIKWVL, D-TIWKVL and 
D-LKTWIV have little inhibitory effect when present at an equimolar 


ratio with VQIVYK, K12 and K19 (Fig. 2e and Supplementary Fig. 7), 
showing that the inhibition is sequence specific. Also, the diastereomer, 
L-TLKIVW, is less effective than D-TLKIVW (Supplementary Fig. 8). 
As a further test of the specificity of our design, we confirmed that 
D-TLKIVW is unable to block the fibril formation of amyloid-f, which 
also is associated with Alzheimer’s disease (Supplementary Fig. 9). This 
suggests that the D-peptide inhibitor is not general to amyloid systems, 
but is specific to the VQIVYK interface in tau protein. Such specificity 
is essential for designed inhibitors if they are not to interfere with 
proteins that natively function in an amyloid state’. 

To confirm that the designed D-peptide inhibits in accordance with 
the design model (Fig. 2b, c and Supplementary Fig. 1), we performed 
several additional tests. First we visualized the position of the inhibitor 
D-TLKIVW relative to fibrils of the tau construct K19 using electron 
microscopy. We covalently linked Monomaleimido Nanogold particles 
both to the inhibitor and, separately, to a scrambled hexapeptide, 
D-LKTWIV. We used a blind counting assay and found that, relative 
to Nanogold alone, D-TLKIVW shows a significant binding preference 
for the end of fibrils, in contrast to the scrambled control peptide, 
D-LKTWIV (Fig. 3a and Supplementary Fig. 10). 

As a further test of the model, we used NMR to characterize the 
binding affinity of D-TLKIVW for tau fibrils. The ¹H NMR spectra for 
D-TLKIVW were collected in the presence of increasing concentrations 
of VQIVYK or K19 fibrils. Because neither K19 nor VQIVYK 
contains tryptophan, we were able to monitor the ¹H resonance of the 
indole proton of the tryptophan in our inhibitor. When bound to a 
fibril, the inhibitor, D-TLKIVW, is removed from the soluble phase 
and the ¹H resonance is diminished (Fig. 3b and Supplementary Fig. 
11). As a control, we also measured spectra for the non-inhibiting 
peptide D-LKTWIV present with D-TLKIVW in the same reaction 
mixture. As shown in Fig. 3b, the presence of VQIVYK fibrils at a 
given concentration reduces the D-TLKIVW indole resonance much 
more than it does the D-LKTWIV indole resonance. Spectra of the two 
peptides are shown in Supplementary Fig. 12. By monitoring the 
D-TLKIVW indole resonance over a range of VQIVYK fibril concentrations, 
we estimate the apparent dissociation constant of the interaction 
between D-TLKIVW and VQIVYK fibrils to be ~2 µM 
(Supplementary Fig. 11a and Methods). This value corresponds to a 
standard free binding energy of ~7.4 kcal mol⁻¹, with ~2.5 kcal mol⁻¹ 
from non-polar interactions and ~4.9 kcal mol⁻¹ from six 
hydrogen bonds (Methods). Repeating the NMR binding experiment 
with K19 fibrils yields a similar trend (Supplementary Fig. 11b). To 
determine whether D-TLKIVW has affinity for soluble VQIVYK, we 
measured ¹H NMR spectra of D-TLKIVW and D-LKTWIV in the 
presence of increasing amounts of soluble VQIVYK. Only a slight 
change in the respective chemical shifts of the indole proton peaks 
of D-TLKIVW and D-LKTWIV is observed, even at a 70-fold molar 
excess of VQIVYK (Supplementary Fig. 13). This, together with the 
ability of the peptide to prevent seeded fibril formation, suggests that 
D-TLKIVW does not interact with monomers but rather with a struc- 
tured, fibril-like species. 

As another test of our design model, we replaced the D-Leu residue 
with D-Ala in D-TLKIVW. Our structural model suggests that D-Leu 2 
of D-TLKIVW is important for preventing tau fibril formation because 
of its favourable interaction with the Ile residue of the VQIVYK mol- 
ecule below and with Ile and the first Val of VQIVYK across the steric 
zipper (Fig. 2b, c and Supplementary Fig. 1). The D-Ala replacement 
eliminates these interactions and, furthermore, removes a steric clash 
that would occur were another VQIVYK molecule placed across from 
the inhibitor (Supplementary Fig. 3 and Supplementary Table 1). 
When the D-Ala variant is incubated with VQIVYK and the tau constructs, 
it has no inhibitory effect on fibril formation (Fig. 2f and 
Supplementary Fig. 14). This confirms that D-Leu 2 is critical for the 
efficacy of D-TLKIVW, consistent with our model. 

Figure 3 | Mechanism of interaction. a, Nanogold covalently bound to 
D-TLKIVW localizes at the ends (arrows) of two tau K19 fibrils. Scale bar, 
50 nm. b, The inhibitor D-TLKIVW binds to fibrils with an estimated affinity 
constant in the low micromolar range, as shown by the indole proton region of 
the 500-MHz ¹H NMR spectra of D-TLKIVW (9.83 p.p.m.) and D-LKTWIV 
(9.98 p.p.m.) in the presence of increasing concentrations of VQIVYK fibrils. 
The resonance of the D-TLKIVW indole proton is reduced in the presence of 
increasing concentrations of VQIVYK fibrils, whereas the indole proton signal 
for the scrambled control peptide D-LKTWIV is only slightly affected. Fibril 
solutions contained 0-1,500 µM VQIVYK monomers, as indicated. 

In summary, although our electron microscopy, NMR and D-Ala 
replacement results support a model in which the designed peptide 
D-TLKIVW binds to the ends of tau fibrils, they do not constitute proof 
that the inhibitors bind exactly as anticipated in the designs (Sup- 
plementary Fig. 15). 

To expand on our design methodology, we computationally 
designed an inhibitor of ²⁴⁸PAP²⁸⁶ fibril formation containing non-natural 
L-amino acids (Fig. 4b and Supplementary Fig. 16), using the 
GGVLVN structure as a template (Fig. 4a and Supplementary Table 2). 
This peptide, Trp-His-Lys-chAla-Trp-hydroxyTic (WW61), contains 
an Ala derivative, β-cyclohexyl-L-alanine (chAla) and a Tyr/Pro 
derivative, 7-hydroxy-(S)-1,2,3,4-tetrahydroisoquinoline-3-carboxylic 
acid (hydroxyTic), both of which increase contact area with the 
GGVLVN template. The non-natural chAla forms hydrophobic inter- 
actions with the Leu residue in the steric-zipper interface, and 
hydroxyTic supports the favourable placement of chAla through 
hydrophobic packing (Fig. 4b and Supplementary Fig. 16b). 

Figure 4 | Designed non-natural peptide inhibits ²⁴⁸PAP²⁸⁶ fibril 
formation. a, The view down the fibril axis of the crystal structure of the 
GGVLVN steric zipper reveals two mating β-sheets with parallel, in-register 
β-strands (hydrogen bonds, green dashed lines; water molecules, yellow 
spheres). b, View roughly perpendicular to a fibril of three layers, with the 
atoms of the side chains of the top layer shown as purple spheres. On top is a 
designed non-natural peptide inhibitor, Trp-His-Lys-chAla-Trp-hydroxyTic 
(blue; see Supplementary Fig. 16). c, The inhibitor blocks ²⁴⁸PAP²⁸⁶ fibril 
formation, as shown by monitoring Thioflavin T fluorescence. With a twofold 
molar excess of the inhibitor (pale red), the fluorescence remains low over the 
course of the experiment for all five replicates, unlike in the absence of inhibitor 
(grey). Mean fluorescence values are shown as solid red and black lines with and 
without the inhibitor, respectively. r.f.u., relative fluorescence units. d, HIV 
infection rates were determined by monitoring β-galactosidase (β-gal) activity. 
Agitated ²⁴⁸PAP²⁸⁶ alone efficiently increases viral infection, whereas 
²⁴⁸PAP²⁸⁶ mixtures incubated with inhibitor were unable to enhance HIV 
infection. Peptide concentrations during virion treatment are indicated on the x 
axis. Error bars show the s.d. of three measurements per sample. r.l.u., relative 
light units. 

Moreover, we propose that the bulky side chains and steric constraints 
of hydroxyTic provide hindrance to further fibril growth. 

This designed peptide, WW61, effectively delays both seeded and 
unseeded fibril formation of ²⁴⁸PAP²⁸⁶ in vitro (Fig. 4c and Supplementary 
Figs 17 and 18). In the presence of a twofold molar excess 
of this inhibitor, seeded fibril formation is efficiently blocked for more 
than two days (Fig. 4c). Furthermore, we see that increasing the con- 
centration of this inhibitor extends the fibril formation lag time 
(Supplementary Fig. 19). These inhibition assay results were further 
confirmed by electron microscopy (Supplementary Fig. 20). As a control 
for specificity, we tested the effect of GIHKQK, from the amino 
terminus of ²⁴⁸PAP²⁸⁶, and PYKLWN, a peptide with the same charge 
as WW61. Neither peptide affected fibril formation kinetics, indicating 
that the inhibitory activity of the designed peptide is sequence specific 
(Supplementary Fig. 21). 

Because ²⁴⁸PAP²⁸⁶ fibrils (SEVI) have been shown to enhance HIV 
infection, we used a functional assay to investigate whether WW61 is 
able to prevent this enhancement. In this experiment, we treated HIV 
particles with ²⁴⁸PAP²⁸⁶ solutions that had been agitated for 20 hours 
(to allow fibril formation) in the presence or absence of WW61, and 
infected TZM-bl indicator cells. As has been previously observed, 
SEVI efficiently enhanced HIV infection. However, ²⁴⁸PAP²⁸⁶ incubated 
with the designed inhibitor prevented HIV infection (Fig. 4d). 

We performed several control experiments to verify that the lack of 
infectivity observed in the assay is indeed due to the inhibition of SEVI 
formation. First we confirmed that in the absence of SEVI the designed 
inhibitor WW61 does not affect HIV infectivity (Supplementary Fig. 
22a). We also found that the control peptides GIHKQK and 
PYKLWN, which do not inhibit ²⁴⁸PAP²⁸⁶ fibril formation, fail to 
decrease HIV infectivity (Supplementary Fig. 22b). Additionally, we 
observed that WW61 has no inhibitory effect on polylysine-mediated 
HIV infectivity, further ruling out a non-specific electrostatic interaction 
mechanism (Supplementary Fig. 22a). Together, these results 
demonstrate that a peptide capable of preventing ²⁴⁸PAP²⁸⁶ fibril 
formation also inhibits the generation of virus-enhancing material. 

Structure-based design of inhibitors of amyloid fibril formation has 
been challenging in the absence of detailed information about the atomic- 
level interactions that form the fibril spine. So far, one of the most suc- 
cessful structure-based approaches to preventing fibril formation has 
been to stabilize the native tetrameric structure of transthyretin. That 
approach is well suited to the prevention of fibril formation of proteins 
with known native structures, but other proteins involved in amyloid-related 
diseases, such as tau protein, amyloid-β and ²⁴⁸PAP²⁸⁶, lack fully 
ordered native structures. Our structure-based approach makes it 
possible to design inhibitors independent of native structure. Instead, 
the templates are atomic-level structures of short, fibril-forming 
segments. By using these fibril-like templates, and adopting computational 
methods successful in designing novel proteins and protein-protein 
interfaces, we have created specific inhibitors of proteins 
that normally form fibrils. These results support the hypothesis that 
the steric zipper is a principal feature of tau-related and SEVI fibrils, and 
suggest that, with current computational methods and steric-zipper 
structures, we have the tools to design specific inhibitors to prevent 
the formation of other amyloid fibrils. 


METHODS SUMMARY 


We used crystal structures of hexapeptide segments of VQIVYK and GGVLVN as 
templates to design peptide inhibitors using the Rosetta software. Briefly, this 
algorithm searches possible side-chain conformations (called rotamers) of all 
amino acids in a peptide β-strand backbone stacked onto the fibril end of both 
segment structures. The Rosetta software is extended to sample the approximate 
side-chain conformation of non-natural D and L amino acids by adapting side- 
chain torsion angles from those in their natural counterparts. The lowest energy 
set of side-chain rotamers is identified by combinatorial optimization of a 
potential consisting of a term for the Lennard-Jones potential, an orientation- 
dependent hydrogen-bond potential term, an implicit solvation term and a struc- 
ture-derived side-chain and backbone torsional potential term. 

Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 

METHODS 


Computational design. Computational designs were carried out using the 
Rosetta software (http://www.rosettacommons.org). This algorithm involves 
building side-chain rotamers of all amino acids onto a fixed protein backbone. 
The lowest energy set of side-chain rotamers is then identified as those which 
minimize an energy function containing a Lennard-Jones potential, an orientation- 
dependent hydrogen-bond potential, a solvation term, amino-acid-dependent 
reference energies and a statistical torsional potential that depends on the back- 
bone and side-chain dihedral angles. 

D-amino-acid tau inhibitors. The crystal structure of VQIVYK (ref. 15; Protein 
Data Bank ID, 2ON9) was used as a starting scaffold for computational design. To 
take full advantage of the statistical nature of the rotamer library and some terms in 
the Rosetta energy function, the stereochemistry of the fibril scaffold was inverted 
so that design would take place using L amino acids. An extended L-peptide was 
aligned with the N, C and O backbone atoms of the D-fibril scaffold. This L-peptide 
was subsequently redesigned, keeping all atoms of the D-fibril fixed. The stereo- 
chemistry of the final design model was then inverted, yielding a D-peptide 
designed to cap an L-fibril. We inspected the finished models to confirm that 
inversion of the stereochemistry at the Thr and Ile Cβ atoms did not make the 
designs energetically unfavourable. Energetic consequences of incorporating a D 
inhibitor peptide in the middle of an L fibril were subsequently evaluated to ensure 
that fibril propagation could not continue after association of an inhibitor. 
Calculations of the area buried and shape complementarity were performed with 
AREAIMOL and SC, respectively. 
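
The inversion step can be illustrated with a minimal sketch. It assumes, purely for illustration, that inversion is carried out by reflecting all coordinates through a plane; the coordinates and function names are hypothetical and are not taken from Rosetta. Reflection preserves every interatomic distance but flips the handedness of each stereocentre, which is why a design made with L amino acids against a mirrored fibril can be reflected back to give a D-peptide against the native L fibril.

```python
import math
from itertools import combinations

def mirror(coords):
    """Reflect every atom through the yz-plane (x -> -x)."""
    return [(-x, y, z) for (x, y, z) in coords]

def signed_volume(center, a, b, c):
    """Triple product of three substituent vectors around a centre.
    Its sign encodes the handedness of the stereocentre."""
    u = [a[i] - center[i] for i in range(3)]
    v = [b[i] - center[i] for i in range(3)]
    w = [c[i] - center[i] for i in range(3)]
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))

# Hypothetical coordinates (in angstroms) for a C-alpha atom and three of
# its substituents; these are illustrative values, not from a real structure.
ca = (0.0, 0.0, 0.0)
n = (1.45, 0.0, 0.0)
c = (-0.5, 1.4, 0.0)
cb = (-0.5, -0.7, 1.2)

original = [ca, n, c, cb]
inverted = mirror(original)

# Handedness flips sign under reflection...
chirality_before = signed_volume(*original)
chirality_after = signed_volume(*inverted)

# ...while all pairwise distances are unchanged.
pairs = list(combinations(range(len(original)), 2))
max_distance_change = max(
    abs(math.dist(original[i], original[j]) - math.dist(inverted[i], inverted[j]))
    for i, j in pairs
)
```

Because the reflection leaves distances untouched, any energy term that depends only on distances is unchanged by the inversion; only terms that depend on handedness need the extra check described above.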

L-peptide ²⁴⁸PAP²⁸⁶ inhibitors. The crystal structure of GGVLVN (PDB ID, 
3PPD) was used as a template for the following design procedure. An extended 
L-peptide was aligned according to crystal symmetry. Small, random perturbations 
of the L-peptide were performed to optimize the rigid-body arrangement between 
the fibril template and the peptide inhibitor. Full sequence optimization of the 
inhibitor was performed using the Rosetta software package, allowing residues 
directly contacting the inhibitor to repack; other scaffold residues remained fixed. 
Because the design calculations use a discrete rotamer representation of the side 
chains, we next performed simultaneous quasi-Newtonian optimization of the 
inhibitor rigid-body orientation, the side-chain torsion angles and, in some cases, 
the backbone torsion angles using the full-atom Rosetta energy function. This 
optimization was essential to the subsequent assessment of the inhibition of the 
design. Several iterative runs of small perturbations in inhibitor placement, inter- 
face design and refinement were performed to improve hydrogen-bonding and 
packing interactions. The designs that ranked highest on the basis of the total 
binding energy between the inhibitor and the fibril scaffold and the interfacial 
shape complementarity were subsequently synthesized and tested. 

For each initial active L-peptide design, the non-natural L amino acids were 
incorporated using a growth strategy. Non-natural amino acids, structurally sim- 
ilar to those of initial active designs, were selected on the basis of their solubility, 
side-chain shape and commercial availability. Side-chain conformations were 
approximately sampled by adopting side-chain torsion angles from those in their 
natural counterparts. Sequence optimization of the inhibitor was performed and 
the optimal set of rotamers identified using Monte Carlo simulated annealing with 
the full-atom energy function described above. The resulting designs were ranked 
on the basis of the total binding energy between the inhibitor and the fibril scaffold. 
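
The rotamer search described above can be sketched as follows. This is a toy illustration of Monte Carlo simulated annealing over discrete rotamer choices, not Rosetta's implementation: the energy tables are random placeholder numbers, and the function names, cooling schedule and step count are assumptions made for the example.

```python
import math
import random

def total_energy(assignment, self_e, pair_e):
    """Sum one-body and two-body terms for one rotamer per position."""
    energy = sum(self_e[i][r] for i, r in enumerate(assignment))
    for i in range(len(assignment)):
        for j in range(i + 1, len(assignment)):
            energy += pair_e[(i, j)][assignment[i]][assignment[j]]
    return energy

def toy_tables(n_pos=4, n_rot=3, seed=1):
    """Random placeholder energies standing in for a real energy function."""
    rng = random.Random(seed)
    self_e = [[rng.uniform(-1, 1) for _ in range(n_rot)] for _ in range(n_pos)]
    pair_e = {
        (i, j): [[rng.uniform(-1, 1) for _ in range(n_rot)] for _ in range(n_rot)]
        for i in range(n_pos) for j in range(i + 1, n_pos)
    }
    return self_e, pair_e

def anneal(self_e, pair_e, steps=5000, t_start=5.0, t_end=0.05, seed=0):
    """Metropolis search with geometric cooling; returns the best state seen."""
    rng = random.Random(seed)
    n_pos = len(self_e)
    current = [0] * n_pos
    e_cur = total_energy(current, self_e, pair_e)
    best, e_best = list(current), e_cur
    for step in range(steps):
        temp = t_start * (t_end / t_start) ** (step / (steps - 1))
        # Propose a new rotamer at one randomly chosen position.
        pos = rng.randrange(n_pos)
        trial = list(current)
        trial[pos] = rng.randrange(len(self_e[pos]))
        e_trial = total_energy(trial, self_e, pair_e)
        # Always accept downhill moves; accept uphill moves with
        # Boltzmann probability at the current temperature.
        if e_trial <= e_cur or rng.random() < math.exp(-(e_trial - e_cur) / temp):
            current, e_cur = trial, e_trial
            if e_cur < e_best:
                best, e_best = list(current), e_cur
    return best, e_best

self_e, pair_e = toy_tables()
start_energy = total_energy([0, 0, 0, 0], self_e, pair_e)
best_rotamers, best_energy = anneal(self_e, pair_e)
```

Tracking the best state seen, rather than returning the final state, guards against the walk ending on an uphill move.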

Tau construct expression and purification. pNG2 expression vectors (derived 
from pET-3b) containing either the K12 or K19 gene were provided by E. 
Mandelkow. Expression in BL21(DE3) Escherichia coli was induced with 
1 mM isopropyl thiogalactoside when the absorbance A600 nm was between 0.8 
and 1.0, and cells were collected after 3-4 h. K12 and K19 were purified on the 
basis of previously described methods. Cells were pelleted for 20 min at 4,700g 
and resuspended in 20 mM MES, pH 6.8, 1 mM EDTA, 0.2 mM MgCl2, 5 mM 
DTT, 1 mM PMSF and a protease inhibitor cocktail. The cells were sonicated for 
2.5 min and, following addition of NaCl to bring the cell lysate to 0.5 M NaCl, the 
lysate was boiled for 20 min. The lysate was sedimented at 30,000g for 20 min and 
dialysed twice against 20 mM MES, pH 6.8, 50 mM NaCl, 1 mM EDTA, 1 mM 
MgCl2, 2 mM DTT and 0.1 mM PMSF at 4°C. The dialysate was pelleted for 
20 min at 30,000g and filtered before cation exchange chromatography on an 
ÄKTA Explorer (GE Pharmacia) with a HighTrap HP SP 5-ml column (GE 
Healthcare). The sample was eluted with a linear gradient of up to 60% buffer B 
(20 mM MES, pH 6.8, 1 M NaCl, 1 mM EDTA, 1 mM MgCl2, 2 mM DTT and 
0.1 mM PMSF). Size exclusion chromatography was optionally performed with a 
Superdex 75 10/300 GL column (GE Healthcare) in PBS buffer (137 mM NaCl, 
3 mM KCl, 10 mM Na2HPO4, 2 mM KH2PO4, pH 7.4) with 1 mM DTT on the 
ÄKTA Explorer depending on preparation purity as assessed by SDS polyacrylamide 
gel electrophoresis. 

Tau construct inhibition assays. Fibril formation assays were performed on the 
basis of previously published protocols. Reaction mixtures (150 µl) containing 
50 µM tau K12 or K19, as determined by the Micro BCA Protein Assay Kit 
(Pierce), were incubated in 250 mM sodium phosphate buffer, pH 7.4, with 
1 mM DTT, 12.5 µM heparin (average molecular mass, 6,000 Da; Sigma) and 
10 µM Thioflavin S (ThS; MP Bio). Inhibitor peptides (CS Bio, Celtek 
Biosciences) were dissolved in 250 mM phosphate buffer, pH 7.4, to 0.5 mM 
and added at specified molar ratios. Reactions were split into a minimum of three 
replicates in black, 96-well, optically clear plates (Nunc), sealed with Corning 
pressure-sensitive sealing tape and monitored using either a Varioskan plate 
reader (Thermo Scientific), for K12, or a SpectraMax M5, for K19. The fluorescence 
signal was measured every 15 min with excitation and emission wavelengths 
of 440 and 510 nm, respectively, at 37°C, with continuous shaking at 
900 r.p.m. with a diameter of 1 mm for K12, and with quiescent incubation with 
shaking 2 s before each reading for K19. Plots showing the fluorescence trace of the 
replicate with median lag time for each sample were created using R. Plots of lag 
time depict the mean time value at which each replicate crossed an arbitrary 
fluorescence value above noise background (values were selected per experiment 
and applied to all samples). Error bars represent the standard deviation of the 
replicate lag times for each sample. 
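
The lag-time summary described above can be sketched in a few lines. This is an illustrative reimplementation, not the authors' R script: the traces and the threshold are made-up numbers, and the function names are hypothetical.

```python
from statistics import mean, stdev

def lag_time(times, signal, threshold):
    """Return the first time point at which the signal exceeds the threshold,
    or None if the trace never crosses it."""
    for t, s in zip(times, signal):
        if s > threshold:
            return t
    return None

def summarize_lags(times, replicates, threshold):
    """Mean and sample standard deviation of replicate lag times.
    Replicates that never cross the threshold are left out here; how such
    traces should be treated is a choice this sketch does not settle."""
    lags = [lag_time(times, trace, threshold) for trace in replicates]
    lags = [lag for lag in lags if lag is not None]
    return mean(lags), stdev(lags)

# Made-up fluorescence traces sampled every 15 min (times in hours).
times_h = [0.0, 0.25, 0.5, 0.75, 1.0, 1.25]
replicates = [
    [1, 1, 2, 6, 9, 10],
    [1, 1, 1, 2, 7, 10],
    [1, 2, 5, 8, 10, 10],
]

# The same threshold is applied to every sample, as in the text.
mean_lag, sd_lag = summarize_lags(times_h, replicates, threshold=4)
```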

Seeded K12 fibril formation assays. Seeds were produced by incubating 50 µM 
K12 as above, but without ThS present, and were added at 0.25% (v/v). Peptide 
stock concentrations were 0.75 mM and were added at a final concentration of 
tenfold molar excess relative to soluble K12. Reaction mixtures were otherwise 
prepared and monitored as above. 

VQIVYK inhibition assays. The VQIVYK fibril formation assay was modified 
from a previously published protocol. Buffers and plates were kept on ice to delay 
VQIVYK fibril formation while the reaction mixtures were prepared. Replicate 
solutions of 180 µl of 25 mM MOPS, pH 7.2, 100 µM ThS and inhibitor peptides 
were added to black, clear-bottomed, 96-well Nunc plates with 1/8-inch PTFE 
beads (Orange Products). Acetylated and amidated VQIVYK (Genscript) was 
dissolved in H2O to 1.3 mM and filtered through a Millipore Microcon 100-kDa 
filter device at 14,000g for 5 min at 4°C to remove large aggregates (final concentration, 
~1 mM). Filtered VQIVYK (20 µl) was added to each reaction well. ThS 
fluorescence was monitored at room temperature every 2 min using a SpectraMax 
M5 fluorometer with 2 s of mixing before each reading. 

Amyloid-β fibril formation assay. Lyophilized amyloid-β(1-42) was diluted to 
0.2 mg ml⁻¹ in 50 mM NH4OH and filtered with a 0.2-µm filter. The reaction 
mixture contained a final concentration of 11.5 µM amyloid-β(1-42), 10 µM 
Thioflavin T (ThT), 23 mM NH4OH in 100 mM bicine, pH 9.1, and 11.5 µM 
D-TLKIVW in reactions with peptide present. Reactions were split into four 
replicates and the ThT fluorescence signal was measured every minute (excitation 
wavelength, 440 nm; emission wavelength, 510 nm), at 37°C, with continuous 
shaking at 960 r.p.m. with a 1-mm diameter in a Varioskan fluorometer. 

Electron microscopy. Sample (5 µl) was applied to glow-discharged, 400-mesh 
carbon-coated, formvar films on copper grids (Ted Pella) for 3 min. Grids were 
rinsed twice with distilled water and stained with 1% uranyl acetate for 90 s. Grids 
were examined in a Hitachi H-7000 transmission electron microscope at 75 keV or 
a JEOL JEM1200-EX operating at 80 keV. 

Tau fibril formation kinetic analysis. The nucleation (k1) and propagation (k2) 
rates were determined by fitting the form of the Finke-Watzky two-step mechanism. 
Plateau values were determined and the remaining parameters were fitted 
using the ‘leasqr’ nonlinear least-squares regression function (http://fly.isti.cnr.it/ 
pub/software/octave/leasqr/) through the OCTAVE software package (http:// 
www.gnu.org/software/octave/). 
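
For orientation, the standard integrated form of the Finke-Watzky two-step model (A -> B with rate constant k1, A + B -> 2B with rate constant k2) can be evaluated as in the sketch below. The exact parametrization that was fitted is not given here, so treating this textbook form as the fitted function is an assumption, and the parameter values are illustrative only. The sketch evaluates the model; it does not reproduce the least-squares fit.

```python
import math

def fw_product(t, a0, k1, k2):
    """Aggregated material [B](t) for the Finke-Watzky two-step model.

    a0 is the plateau (initial monomer concentration), k1 the nucleation
    rate constant and k2 the autocatalytic growth rate constant.
    """
    rate = k1 + k2 * a0
    return a0 - (k1 / k2 + a0) / (1.0 + (k1 / (k2 * a0)) * math.exp(rate * t))

# Illustrative parameters: 50 (arbitrary concentration units) plateau,
# slow nucleation and faster growth. These are not fitted values.
A0, K1, K2 = 50.0, 1e-4, 0.01

# Sigmoidal curve: near zero during the lag phase, then rising to A0.
curve = [fw_product(t, A0, K1, K2) for t in range(0, 60, 5)]
```

A smaller k1 relative to k2 * a0 lengthens the flat lag phase before the rise, which is the behaviour the lag-time plots summarize.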

Preparation of peptide-gold conjugates. Peptide-Nanogold conjugates were 
prepared as described earlier for similarly sized peptides. Briefly, 60 nmol of 
the peptides CGGG-(D)-TLKIVW and CGGG-(D)-LKTWIV (CS Bio) were dissolved 
in 110 µl of phosphate-buffered saline (20 mM, pH 6.5, 0.15 M NaCl), added 
to 6 nmol of Monomaleimido Nanogold (Nanoprobes), dissolved in 200 µl H2O 
and incubated for 1h at room temperature (22°C) with constant rotation. 
Peptide-Nanogold conjugates were separated from excess unbound peptides by 
membrane centrifugation (Microcon-10 system, Amicon) using a molecular mass 
cut-off of 10 kDa. Peptide-Nanogold conjugates were then diluted into phosphate-buffered 
saline, aliquoted and stored at −20°C for no longer than one month. 

Preparation of K19 fibrils. K19 fibrils were generated by incubating 100 µM 
soluble K19 with 25 µM 6-kDa heparin overnight at 37°C in phosphate buffer 
(50 mM, pH 7.4). K19 fibrils were sonicated for 15 s, using a microtip set to 35% 
amplitude. Residual heparin and small oligomers were removed by centrifuging 
the mixture through a 100-kDa Microcon concentrator for 10 min at 14,000g, 
washing the retentate with phosphate buffer and repeating three times; the retentate 
was restored to its original volume with phosphate buffer. These short fibril seg- 
ments were stored at 4°C for no longer than one week. For NMR studies, fibril 
samples were similarly prepared, but were washed in H2O and concentrated to 
2mM K19 (by monomer). 


©2011 Macmillan Publishers Limited. All rights reserved 


Preparation of samples for Nanogold binding experiments. Nanogold-conjugated 
inhibitor (or control) (10 nM) was incubated with 1.67 µM K19 fibrils (by 
monomer) in MOPS buffer (25 mM, pH 7.2) for 1 h. We applied 5 µl of the mixture to a 
glow-discharged, 400-mesh carbon-stabilized copper grid (Ted Pella) for 3 min. The 
grids were washed twice with H2O and 10 µl of the Goldenhance reagent was 
applied for 10 s. The grids were washed five times with H2O and negatively stained 
with 2% uranyl acetate. 

Quantification and localization of Nanogold binding. For each sample, 75 
Nanogold particles ≤15 nm in diameter were counted and classified as bound 
or unbound. The 15-nm cut-off was chosen to exclude unbound, but adjacent, 
particles enlarged by Goldenhance that only apparently bind fibrils. To establish 
the localization of the binding observed, individual Nanogold particles bound to 
fibrils were categorized as bound to the fibril end or side. In both of these experi- 
ments, sample identities were concealed from the microscopist to ensure unbiased 
counting. Grids were examined with a JEOL JEM1200-EX and images were 
recorded using DIGITALMICROGRAPH (Gatan). 

Statistical analysis of Nanogold binding. We compared counts of Nanogold- 
conjugated peptides and unconjugated Nanogold bound to fibrils or localizing to 
fibril ends. Twenty-one unconjugated Nanogold particles out of 75 counted bound 
to fibrils. We modelled Nanogold particles bound to fibrils using a binomial 
distribution with parameters n = 75 (sample size: number of observations) and 
P= 0.28 (probability of success). In a separate experiment, 22 unconjugated 
Nanogold particles bound to fibrils that localized to fibril ends, following a bino- 
mial distribution with n = 105 and P= 0.21. 

Because the number of counts is fairly large, we assumed a normal distribution 
and used a standard Z-test to compare the number of bound Nanogold-peptide 
conjugates with the expected distribution based on the number of bound, uncon- 
jugated Nanogold particles. We used an analogous analysis to determine the 
significance of localization to fibril ends. 

The numbers of Nanogold-D-TLKIVW conjugates bound to fibrils (x_bound = 
43, n = 75) and bound Nanogold-D-TLKIVW conjugates localizing to the end of 
fibrils (x_end = 49, n = 86) were significantly different from the corresponding 
numbers for Nanogold alone, whereas the number of Nanogold-D-LKTWIV 
conjugates bound (x_bound = 15, n = 75) or the number localized to fibril ends 
(x_end = 17, n = 100) did not differ significantly from the corresponding numbers for 
Nanogold alone. 
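
The comparison above can be reproduced with a short sketch that uses only the counts quoted in this section. It assumes a one-sample Z-test against the fixed reference proportions (P = 0.28 for binding, P = 0.21 for end localization) and, as a further assumption, a two-sided 5% cut-off of |Z| > 1.96; neither the function names nor the cut-off come from the original analysis.

```python
import math

def binomial_z(x, n, p0):
    """Normal-approximation Z score for x successes in n trials,
    tested against a reference success probability p0."""
    expected = n * p0
    sd = math.sqrt(n * p0 * (1.0 - p0))
    return (x - expected) / sd

def two_sided_p(z):
    """Two-sided tail probability of a standard normal deviate."""
    return math.erfc(abs(z) / math.sqrt(2.0))

P_BOUND = 0.28  # unconjugated Nanogold bound to fibrils (21 of 75)
P_END = 0.21    # bound unconjugated Nanogold at fibril ends (22 of 105)

# Counts for the designed inhibitor and the scrambled control.
z_inhibitor_bound = binomial_z(43, 75, P_BOUND)
z_inhibitor_end = binomial_z(49, 86, P_END)
z_control_bound = binomial_z(15, 75, P_BOUND)
z_control_end = binomial_z(17, 100, P_END)

CUTOFF = 1.96  # assumed two-sided 5% significance level
```

Under these assumptions the inhibitor conjugate gives Z scores of roughly 5.7 (binding) and 8.2 (end localization), whereas the scrambled control stays within the cut-off at about -1.5 and -1.0, in line with the conclusions stated above.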

VQIVYK preparation for binding studies. Acetylated and amidated VQIVYK 
peptide (Genscript) was dissolved to 1 mM in 25 mM MOPS, pH 7.2, and incubated 
at room temperature for at least 24 h. Fibrils were washed with H2O, concentrated 
using an Amicon ultracentrifugal filter with a 3-kDa molecular mass 
cut-off and resuspended in H2O to a final concentration (by monomer) of 4 mM. 
Soluble VQIVYK was prepared by dissolving VQIVYK peptide (CS Bio) with free 
amino and carboxy termini in H2O. 

¹H NMR sample preparation and measurements. NMR samples were prepared 
with 5% D2O and 10 mM NaOAc, pH 5.0. D-peptides were added from 1 mM 
stocks in H2O to a final concentration of 100 µM. Soluble and fibrillar VQIVYK 
and tau protein were added at indicated concentrations to make a final volume of 
550 µl. ¹H NMR spectra measured at 500 MHz were collected on a Bruker DRX500 
at 283 K. H2O resonance was suppressed through presaturation. Spectra were 
processed with XWINNMR 3.6. 

Binding constant estimations. NMR data were analysed to estimate a binding 
constant for the interaction between D-TLKIVW and VQIVYK fibrils. At about 
1,000 µM VQIVYK (concentration as monomer), 50% of D-TLKIVW is bound 
(Supplementary Fig. 11). The steric-zipper model suggests that there are two 
monomers per 4.7 Å (0.47-nm) layer in a fibril, so that the number of monomers 
per fibril is given by [fibril length (nm)] × (2 monomers per 0.47 nm), and 
we estimate the fibril concentration using the monomer concentration: 
[VQIVYK_fibril] = [VQIVYK_monomer]/(monomers per fibril). If we assume one 
binding site and estimate from electron microscopy an average length of 
~140 nm per fibril, then there are about 600 monomers per fibril, and the apparent 
dissociation constant is about 2 µM. 
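
The arithmetic behind this estimate can be laid out as a short sketch. It follows the text in treating the fibril concentration at half-maximal binding as the apparent dissociation constant; the function names are hypothetical and the inputs are the values quoted above.

```python
def monomers_per_fibril(length_nm, monomers_per_layer=2, layer_nm=0.47):
    """Number of monomers in a fibril of the given length, assuming two
    monomers per 0.47-nm layer as in the steric-zipper model."""
    return length_nm * monomers_per_layer / layer_nm

def fibril_concentration_uM(monomer_uM, length_nm):
    """Fibril concentration implied by a monomer-equivalent concentration."""
    return monomer_uM / monomers_per_fibril(length_nm)

# Values quoted in the text: ~140-nm fibrils, half of the inhibitor bound
# at about 1,000 uM VQIVYK (as monomer).
n_mono = monomers_per_fibril(140.0)
kd_uM = fibril_concentration_uM(1000.0, 140.0)
```

With these inputs the sketch gives about 596 monomers per fibril and about 1.7 µM, which round to the "about 600" and "about 2 µM" quoted above. The result scales inversely with the assumed fibril length, so the estimate is only as good as the length measured by electron microscopy.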

Hydrogen-bonding energy calculation. We used AREAIMOL to calculate the 
non-polar and polar areas buried by the interaction of D-TLKIVW with the 
VQIVYK steric zipper (Fig. 2b, c and Supplementary Fig. 1). We calculate buried 
areas of 201, 24 and 102 Å² for carbon, nitrogen and oxygen atoms, respectively. 
Using the atomic solvation parameters of ref. 43, we estimate that the free energy of 
transferring the inhibitor from a non-polar phase to an aqueous phase, ΔG_solvation, 
is approximately 2.5 kcal mol⁻¹. On the basis of an apparent dissociation constant 
of 2 µM, we estimate the total free energy change of bringing the inhibitor into 
contact with the VQIVYK steric-zipper template, ΔG_binding, to be 7.4 kcal mol⁻¹. 
From the interaction model (Fig. 2c and Supplementary Fig. 1), we count six 
hydrogen bonds between D-TLKIVW and VQIVYK, and estimate the free energy 
change per hydrogen bond to be (ΔG_binding − ΔG_solvation)/6, or ~0.8 kcal mol⁻¹. 
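
The numbers in this paragraph can be checked with a brief sketch. It assumes that the binding free energy is obtained as RT ln(1/Kd) with a 1 M standard state and that the temperature is 283 K, the temperature at which the NMR spectra were recorded; the temperature used for the published value is not stated here, so both choices are assumptions, as are the names in the code.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1

def binding_free_energy(kd_molar, temp_k):
    """Magnitude of the standard binding free energy, RT ln(1/Kd),
    in kcal mol^-1 (1 M standard state assumed)."""
    return R_KCAL * temp_k * math.log(1.0 / kd_molar)

KD = 2e-6            # apparent dissociation constant, 2 uM
TEMP_K = 283.0       # assumed: temperature of the NMR measurements
DG_SOLVATION = 2.5   # non-polar contribution quoted in the text
N_HBONDS = 6         # hydrogen bonds in the interaction model

dg_binding = binding_free_energy(KD, TEMP_K)
dg_hbonds_total = dg_binding - DG_SOLVATION
dg_per_hbond = dg_hbonds_total / N_HBONDS
```

Under these assumptions the sketch gives about 7.4 kcal mol⁻¹ in total, 4.9 kcal mol⁻¹ attributable to hydrogen bonds and 0.8 kcal mol⁻¹ per bond, matching the figures in the text. At 298 K the total would instead come to about 7.8 kcal mol⁻¹.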

GGVLVN crystallization and structure determination. The GGVLVN peptide 
was dissolved in 10 mM Tris, pH 9, at 1.8 mg ml⁻¹ and crystallized in 10% (w/v) 
PEG-8000, 0.1 M MES, pH 6.0, and 0.2 M Zn(OAc)2. X-ray diffraction data were 
collected at APS beamline 24-ID-E. Phases were determined by molecular replacement 
using an idealized β-strand in PHASER. Crystallographic refinement was 
performed using REFMAC. Model building was performed with COOT and 
illustrated with PYMOL. 

²⁴⁸PAP²⁸⁶ fibril formation and inhibition. Fmoc-β-cyclohexyl-L-alanine and 
Fmoc-7-hydroxy-(S)-1,2,3,4-tetrahydroisoquinoline-3-carboxylic acid were purchased 
from AnaSpec and the inhibitor peptide Trp-His-Lys-chAla-Trp-hydroxyTic 
(WW61) was synthesized by Celtek Biosciences. ²⁴⁸PAP²⁸⁶ and 
WW61 were dissolved as ×1.25 and ×5 stocks in PBS, respectively, and filtered 
with a 0.1-µm filter. ²⁴⁸PAP²⁸⁶ was diluted with PBS to 0.66 mM and ThT was 
added to 10 µM final concentration. Samples were optionally mixed with 1.32 mM 
WW61 and vortexed. Five replicates of 150 µl were immediately dispensed into a 
96-well plate. In dose-response experiments, WW61 final concentrations were 
0.33, 0.66 and 1.32 mM. Plates were continuously agitated at 960 r.p.m. at 37 °C, 
and ThT fluorescence readings were recorded (excitation wavelength, 440 nm; 
emission wavelength, 482 nm) at 15-min intervals with a Varioskan Flash fluo- 
rometer. Lag time was determined when fluorescence crossed an arbitrary value 
(3 r.f.u.) above background. 

Effect of WW61 on fibril-mediated enhancement of HIV-1 infection. The 
CCR5-tropic molecular HIV-1 clone NL4_3/92TH014-2 was generated by transient 
transfection of 293T cells with proviral DNA. Supernatants were collected 
48 h later and p24 concentrations determined by ELISA. TZM-bl reporter cells 
encoding a lacZ gene under the control of the viral LTR promoter were obtained 
through the NIH AIDS Research and Reference Reagent Program and provided by 
Dr John C. Kappes, Dr Xiaoyun Wu and Tranzyme. HIV-1 (40 µl) containing 
0.1 ng of p24 antigen was incubated with 40-µl dilutions of mixtures of ²⁴⁸PAP²⁸⁶ 
and inhibitory peptide, WW61, that were either freshly prepared or had been 
agitated for 23 h. Peptide concentrations and experimental conditions during 
agitation were similar to those described above. Thereafter, 20 µl of the mixtures 
were used to infect 180 µl of TZM-bl cells seeded the day before (10° per well). Two 
days later, infection rates were determined by quantifying β-galactosidase 
activities in cellular lysates using the Gal-Screen assay (Applied Biosystems, 
T1027). Luminescence was recorded on an Orion microplate luminometer as 
relative light units per second. 

Effect of WW61 on polylysine-mediated enhancement of HIV-1 infection. 
Polylysine (Sigma Aldrich) (50 μl) was mixed with an equal volume of WW61.
Thereafter, 35-μl fivefold dilutions of the polylysine-WW61 mixture or polylysine
alone were mixed with the same volume of virus and incubated for 5 min at
room temperature. Polylysine-WW61 concentrations were 100, 20, 4, 0.8, 0.16,
0.032, 0.064 and 0 μg ml⁻¹ during pre-incubation with virus stocks. Thereafter,
20 μl of each mixture was added to 180 μl of TZM-bl cells. The infection rate was
determined two days later as described above. 

Effect of WW61, GIHKQK and PYKLWN on HIV-1 infection. Each peptide 
(40 μl) was incubated with an equal volume of virus containing 1 ng of p24
antigens for 5 min at room temperature. Peptide concentrations were 150, 30, 6,
1.2 and 0 μg ml⁻¹ during pre-incubation with virus stocks. Thereafter, 20 μl of
each mixture was added separately to 180 μl of TZM-bl cells (tenfold dilution) and
the infection rate was determined as above. 
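
As a rough check on the arithmetic, the fivefold series and the tenfold dilution on cells can be reproduced as follows. The helper names are hypothetical and concentrations are simply in the units quoted in the text.

```python
def dilution_series(top, fold, n_points):
    """Concentrations obtained by diluting `top` serially by `fold`, n_points long."""
    return [top / fold ** i for i in range(n_points)]


def concentration_on_cells(conc, added_volume, cell_volume):
    """Concentration after adding `added_volume` of mixture to `cell_volume` of cells."""
    return conc * added_volume / (added_volume + cell_volume)


# Pre-incubation concentrations: 150, then three fivefold steps (30, 6, 1.2).
pre_incubation = dilution_series(150.0, 5, 4)

# 20 volumes of mixture into 180 volumes of cells gives a tenfold dilution.
on_cells = [concentration_on_cells(c, 20.0, 180.0) for c in pre_incubation]

for pre, final in zip(pre_incubation, on_cells):
    print(f"{pre:g} during pre-incubation, {final:g} on cells")
```

The peptide-free control (0) is not part of the geometric series and would be added separately.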

Structural insight into brassinosteroid
perception by BRI1


Ji She*3*, Zhifu Han'*, Tae-Wuk Kim‘, Jinjing Wang’, Wei Cheng®, Junbiao Chang”, Shuai Shi’, Jiawei Wang', Maojun Yang’, 
Zhi-Yong Wang* & Jijie Chai!? 


Brassinosteroids are essential phytohormones that have crucial roles in plant growth and development. Perception of 
brassinosteroids requires an active complex of BRASSINOSTEROID-INSENSITIVE 1 (BRI1) and BRI1-ASSOCIATED 
KINASE 1 (BAK1). Recognized by the extracellular leucine-rich repeat (LRR) domain of BRI1, brassinosteroids induce a 
phosphorylation-mediated cascade to regulate gene expression. Here we present the crystal structures of BRI1(LRR) in free 
and brassinolide-bound forms. BRI1(LRR) exists as a monomer in crystals and solution independent of brassinolide. It 
comprises a helical solenoid structure that accommodates a separate insertion domain at its concave surface. Sandwiched 
between them, brassinolide binds to a hydrophobicity-dominating surface groove on BRI1(LRR). Brassinolide recognition
by BRI1(LRR) is through an induced-fit mechanism involving stabilization of two interdomain loops that creates a
pronounced non-polar surface groove for the hormone binding. Together, our results define the molecular mechanisms
by which BRI1 recognizes brassinosteroids and provide insight into brassinosteroid-induced BRI1 activation.


Ubiquitously distributed throughout the plant kingdom, brassinos- 
teroids are a class of low-abundance phytohormones that have crucial 
roles in many aspects of growth and development’*. Studies in 
Arabidopsis have led to the identification of a number of genes 
involved in brassinosteroid perception and signalling*’. One of them, 
BRI1, when mutated, abolishes brassinosteroid-mediated responses of
plants’. BRI1 belongs to a large family of plant LRR receptor-like 
kinases (RLKs) with more than 200 members in Arabidopsis and over 
300 in rice’. LRR RLKs are characterized by an extracellular LRR 
domain, a single-pass transmembrane segment and a cytoplasmic 
kinase domain. BRI1 has been established as a bona fide receptor of 
brassinosteroids by genetic and biochemical investigations*'’’. The 
most convincing evidence for this comes from a biochemical study 
showing that BRI1(LRR) is essential and sufficient for recognition of 
brassinosteroids’””. A 70-residue island domain (residues 587-656) is 
indispensable for brassinosteroid recognition. Binding of brassinos- 
teroids to BRI1 initiates a phosphorylation-mediated cascade, trans- 
ducing the extracellular steroid signal to transcriptional programs'*™*. 

A second gene named BAK1 involved in brassinosteroid signalling 
also encodes an LRR RLK, albeit with only five extracellular LRRs*°. 
Although not involved in brassinosteroid binding, BAK1 promotes 
brassinosteroid-induced signalling by physically associating with 
BRI1 (refs 4, 5, 15). Several models**>*'*"? of brassinosteroid-induced
BRI1 activation converge at transphosphorylation within the BRI1- 
BAK1 complex, which involves brassinosteroid-enhanced BRI1 homo- 
dimerization’®, release of BRI1 KINASE INHIBITOR 1 (BKI1)’? from 
BRI1, and BRI1-BAK1 heterodimerization’”’. It is still not fully 
understood how brassinosteroids are perceived by BRI1 to give rise to
these events. 

In the present study, we report the crystal structures of free 
BRI1(LRR) and its complex with brassinolide (the most active form 
of brassinosteroid) (Supplementary Table 1). The structures not only 
reveal the molecular mechanisms underlying brassinosteroid recognition
by BRI1 and provide insight into brassinosteroid-induced BRI1
activation, but also explain the structure-activity relationship of
brassinosteroids and serve as a foundation for the rational design of 
nonsteroidal mimetics. 


A helical solenoid structure of BRI1(LRR)


The three-dimensional structure of brassinolide-free BRI1(LRR) (resi- 
dues 30-589, 596-643 and 647-772) contains 25 LRRs as predicted’, 
including 24 regular ones and an irregular one abutting the amino- 
terminal side. The LRRs packed in tandem assemble into a highly curved 
solenoid structure (Fig. 1a), with the overall rotation angle about the 
central axis approximately 360 degrees (Fig. 1b). Compared to the regu- 
lar horseshoe-shaped structures of other LRR proteins, one notably 
distinct feature of the BRI1(LRR) structure is that it is exceptionally 
twisted, resulting in formation of a whole turn of right-handed helix 
with an inner diameter of about 30 Å. Relative to LRR1, LRR25 shifts
about 60 Å along the central axis of the helix (Fig. 1a). An unprecedented
structural feature of BRI1(LRR) is that it has an insertion domain
anchored to the inner surface of the solenoid and running across six
LRRs (Fig. 1a). The N-terminal capping domain (residues 30-70), consisting
of one β strand (β1) and two α helices, is integrated into the LRR
structure by forming an anti-parallel β sheet with the β strand from the
irregular repeat, whereas the carboxy-terminal capping domain (residues
752-772) uses two short helices to tightly pack against the last
repeat (Fig. 1a). Eight potential glycosylation sites (Supplementary Fig.
2) as defined by sufficient electron density were found in BRI1(LRR). An 
assignment of functions to them awaits further investigations. 

Like canonical LRR proteins, the concave surface of the BRI1(LRR) 
solenoid is composed of a parallel β sheet comprising 25 continuously
running parallel β strands. Compared to many other LRR proteins,
however, BRI1(LRR) also possesses parallel but distorted β sheets following
the inner β structure in most of the repeats (Fig. 1a, b). This is
probably due to the plant-specific consensus sequence L/fXGxI/vP (X
and x stand for polar and any amino acid, respectively)”° of BRI1(LRR)s
(Supplementary Fig. 2) found in many LRR RLKs, as a similar structural
Figure 1 | BRI1(LRR) has a helical solenoid structure. a, b, Overall structures
of brassinolide-free BRI1(LRR) shown in two different orientations. The
N-linked sugars (N-acetylglucosamines) are shown in magenta stick
representation. Coloured in orange are disulphide bonds. ID, insertion domain.
The N- and C-terminal caps are shown in slate grey and marine blue, respectively.


feature also exists in the plant LRR protein PGIP2 (ref. 21). In contrast 
with the concave side, the convex outer surface consists of varied secondary
structure elements, including 3₁₀ helices, α helices and loops of different
lengths (Fig. 1a, b). Some of the loops are stabilized through disulphide
bonds formed between two consecutive repeats (LRR2-LRR3, LRR5-
LRR6, LRR7-LRR8, LRR10-LRR11 and LRR14-LRR15) (Fig. 1a and
Supplementary Fig. 1). The overall structure of BRI1(LRR) (residues
30-775) remains nearly unchanged on brassinolide binding, with a root
mean squared deviation (r.m.s.d.) of 0.70 Å over 734 Cα atoms. The
brassinolide molecule is sandwiched between the insertion domain and 
the concave surface of the solenoid structure (Fig. 1c). 
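
For readers unfamiliar with the r.m.s.d. figure quoted above, the following minimal Python sketch shows how such a value is computed from two matched lists of Cα coordinates. It assumes the two structures have already been superposed (the text does not say which program was used for that step), and the coordinates shown are toy values, not taken from the deposited models.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean squared deviation between two equally long lists of (x, y, z) points.

    Assumes the points are paired atom-by-atom and already superposed;
    no fitting is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


# Toy example: every atom displaced by 1 Å along z gives an r.m.s.d. of 1 Å.
free = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
bound = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(free, bound))  # 1.0
```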


Interruption of BRI1(LRR)s by a folded domain
Two short segments (residues 590-595 and 644-646) from the inser- 
tion domain do not have interpretable electron density in the struc- 
ture of brassinolide-free BRI1(LRR) but become well defined after 
brassinolide binding (Supplementary Fig. 3). Residues 596-643 form 
a separate folded domain that consists of one three-stranded antiparallel
β sheet and one 3₁₀ helix (Fig. 2a). In addition, this domain
contains one disulphide bridge (Cys 609-Cys 635) that can further
contribute to its structural integrity. A database search using the
Dali server (http://ekhidna.biocenter.helsinki.fi/dali_server) did not
reveal known structures that share significant homology with the 
insertion domain, indicating that it represents a novel fold. 
Extensive non-covalent interactions including van der Waals and 
polar interactions exist between the insertion domain and the solenoid 
structure. While making few contacts with the remaining parts of the 
The blue numbers in a indicate the positions of LRRs. The β strand (β1) from
the N-terminal cap is labelled. c, Overall structure of brassinolide-bound
BRI1(LRR) with the same orientation as a. Brassinolide molecule is shown in 
stick representation and coloured in yellow. 


insertion domain, the 3₁₀ helix packs tightly against Phe 449 and
Leu 473 from underneath and forms a large network of hydrogen
bonding interactions with those from one flanking side (Fig. 2b). In
contrast with the 3₁₀ helix, β36 primarily binds the concave surface
(Fig. 2a). Around this binding site, Trp 516, Ile 540, Trp 564 and
Phe 658 establish close contacts with one side of β36, whereas
Trp 472 and Phe 497 wedge between β36 and the 3₁₀ helix (Fig. 2b).
Hydrogen bonds dominate the interactions of the loop linking the 3₁₀
helix and β36 with the concave surface. Additionally, packing of the
carbohydrate moiety of the glycosylated Asn 545 against the loop
connecting the 3₁₀ helix and β37 seems to have a role in positioning
the insertion domain in the concave surface (Fig. 2a). Two mutations,
G613S and S662F, generated weak hormone-insensitive phenotypes’.
Our structure indicates that they can perturb local conformations and
consequently generate a deleterious effect on BRI1 recognition of
brassinosteroids. Substitution of Gly 613 with serine would produce steric
clash with the carbonyl oxygen of Ile 600 or the benzene ring of Tyr 599
and consequently generate a damaging effect on the β sheet of the
insertion domain (Supplementary Fig. 4). Ser 662 is limited within a
small pocket and its mutation to the bulky residue phenylalanine is
expected to generate serious clash with its neighbouring residues, in
particular the Gly 611 that is N terminal to β36.


Brassinolide binds to a surface groove on BRI1(LRR)

We used ³H-labelled brassinolide to test if the purified recombinant
BRI1(LRR) was active in binding brassinolide. As shown in Fig. 3a, the 
protein showed a comparable brassinolide-binding activity to that 


Figure 2 | Interaction of the insertion domain with LRRs. a, Cartoon 
representation of BRI1(LRR) highlighting interaction of the insertion domain 
with the concave surface. The positions of two genetic mutants are shown in red 
and stick representation. b, Detailed interactions of the insertion domain with 
LRRs. The side chains from the insertion domain are shown and labelled in 
purple, and those from LRRs in cyan. Red dashed lines represent hydrogen 
bonds. 


previously reported’*. In the brassinolide-bound BRI1(LRR) struc- 
ture, the electron density unambiguously defines the brassinolide 
molecule that binds to a pronounced hydrophobic groove between 
the insertion domain and the concave side of the solenoid (Fig. 3b). 
Most of the residues lining the surface groove are hydrophobic and 
highly conserved (Supplementary Fig. 1). The fused ring moiety of 
brassinolide (Supplementary Fig. 5) occupies most of the surface 
groove (Fig. 3b). The A ring (see Supplementary Fig. 5) makes mar- 
ginal contacts with the concave surface, whereas the B ring tightly 
stacks Tyr 642 from one lateral side. The other two rings are sand- 
wiched between Phe 681 and Tyr 599. Despite a smaller size, the two 
methyl groups C18 and C19 form much denser hydrophobic interactions
with BRI1 by docking into two cavities at the bottom. In
contrast, the other side of the fused ring is nearly solvent exposed 
(Fig. 3b). Although extensive shape complementarities exist between 
brassinolide and the base of the surface groove, a water molecule fills 
the cavity that is sealed from the solvent region by Lys 601, bridging a 
Figure 3 | Brassinolide binds a hydrophobic groove between the insertion
domain and the inner surface of LRRs. a, ³H-brassinolide (³H-BL) binding
activity of BRI1(LRR). About 1 mg ml⁻¹ BRI1(LRR)-His (red bar) or bovine
serum albumin (BSA) as control (blue bar) was incubated with 20 nM
³H-brassinolide in the absence (−BL) or presence (+BL) of 20 μM unlabelled
brassinolide. BRI1-bound ³H-brassinolide was recovered using nickel beads
and quantified by scintillation counting. Data represent the average of triplicate
assays and error bars are standard deviations. b, Detailed interactions between
brassinolide and BRI1(LRR). Shown in mesh is omit electron density (2.5σ)
around brassinolide. The insertion domain and LRRs are coloured in slate grey 
and salmon pink, respectively. The side chains from both the insertion domain 
and LRRs are shown in slate grey. Red spheres represent oxygen atoms of water 
molecules. The three B strands from the insertion domain are labelled. 


hydrogen bond between the carbonyl oxygen at C6 and the hydroxyl 
group of Tyr 599. Standing in contrast with the fused ring, the distal 
carbon atoms C24—C28 of the side chain are completely buried into a 
hydrophobic pocket (Fig. 3b). Further reinforcing the interactions 
around this contact interface, the hydroxyl group at C23 establishes 
a hydrogen bond with the backbone nitrogen of Ser 647 and two 
water-mediated hydrogen bonds with the carbonyl oxygen atoms of 
His 645 and the hydroxyl group of Tyr 597. 


Induced-fit brassinolide recognition by BRI1(LRR) 

The crystals of both free and brassinolide-bound BRI1(LRR)s contain 
one protein molecule per asymmetric unit and have nearly identical 
packing. This indicates that homodimers formed by two crystal- 
lographic symmetry-related molecules (Supplementary Fig. 6) result 
from crystal packing rather than brassinolide binding. Consistently, 
gel filtration assays showed that BRI1(LRR) was monomeric in solution
and its apparent molecular weight was not affected by the presence
of brassinolide (Fig. 4a). The latter is consistent with the
observation” that brassinolide had no effect on BRI1 homodimeriza- 
tion in protoplasts. Structural comparison revealed that upon brassi- 
nolide binding marked local structural rearrangement occurs to the 
two loops linking the insertion domain and the LRR structure 
(referred to as interdomain loops) (Fig. 4b). Brassinolide-induced 
stabilization of the interdomain loops results in formation of a more 
marked surface groove where brassinolide binds (Fig. 4c), showing 
that brassinolide recognition by BRI1(LRR) is a process of induced fit. 


Discussion 

The hydrophobicity-dominating brassinolide-binding surface groove 
(Fig. 4c, right) indicates that BRI1(LRR) may be accommodating in 
ligand binding. This would afford an explanation for why BRI1 is able
to respond to a variety of brassinolide derivatives. BRI1 ligand 
selectivity, on the other hand, can be conferred by the distal polar 
groups involved in hydrogen bonding interactions as well as the actual 
shape of the hormone-binding groove. The structural mechanism of 
brassinolide recognition by BRI1(LRR) rationalizes (Supplementary 
Fig. 7) the data on the structure-activity relationship of brassinosteroids
accumulated during the past decades’, thus opening new perspectives
for designing and developing low-cost nonsteroidal mimetics of
brassinosteroids that can be widely applied in agricultural
practice.


Figure 4 | Brassinolide induces stabilization of two interdomain loops but
no dimerization of BRI1(LRR). a, Brassinolide (BL) has no effect on the
oligomeric status of BRI1(LRR) in solution. Top, superposition of the gel
filtration chromatograms of BRI1(LRR) in the absence (blue) and presence
(red) of brassinolide. The vertical and horizontal axes represent ultraviolet
absorbance (λ = 280 nm) and elution volume, respectively. Peak fractions are
highlighted within the dashed square. The apparent molecular weight of
BRI1(LRR) was 109.4 kDa in the presence or absence of brassinolide, higher
than the theoretical BRI1(LRR) monomer (83 kDa), probably owing to the 
existence of multiple glycosylation sites in BRI1(LRR). DMSO, 
dimethylsulphoxide. Bottom, Coomassie blue staining of the peak fractions 
Our structural analyses reveal that, although the overall structure of
BRI1(LRR) remains nearly unchanged upon brassinolide binding, 
significant structural rearrangement occurs to the interdomain loops 
around the brassinolide-binding site (collectively referred to as the 
brassinolide-created surface), which should be associated with 
brassinosteroid-induced signalling. It is not clear how this region 
contributes to brassinosteroid-initiated signalling, but it is probably 
involved in stabilization’® of the kinase-domain-mediated BRI1 
homodimers’’. The absence of an effect of brassinolide on BRI1 
homodimerization in the current study may be due to the isolated 
ectodomain used for assays or requirement of another protein or 
peptide, as indicated by the suppression of bri1 mutants by a secreted
peptidase™*. If this is the case, BRI1 would be different from Toll-like 
receptors, in which ligand-induced homodimerization for activation 
can be recapitulated through their isolated extracellular domains”*”®. 
Stabilization of BRI1 homodimers by the brassinolide-created surface 
is expected to initially activate BRI1 kinase to a basal level, resulting in 
tyrosine phosphorylation and disassociation of BKI1 (ref. 19); dis- 
association of BKI1 allows stable association and sequential reciprocal 
phosphorylation between the kinase domains of BRI1 and BAK1, fully
activating the BRI1 kinase as proposed by the sequential phosphor- 
ylation model’®. Alternatively, brassinolide binding may initiate sig- 
nalling by altering an interaction between the extracellular domains of 
BRI1 and BAK1. A role of the extracellular domain of BAK1 in
brassinosteroid signalling has been indicated by the observations that
mutations in this domain of BAK1 affect brassinosteroid sensitivity”



shown in the top of the panel after SDS-PAGE. MM, molecular mass marker. 
b, Brassinolide binding induces stabilization of two interdomain loops. 
Structural superimposition of the free and brassinolide-bound BRI1(LRR) 
around the brassinolide-binding site. c, Brassinolide binding generates a 
marked hydrophobic surface groove on BRI1(LRR). Shown on the left and right
are the electrostatic surfaces of free BRI1(LRR) and brassinolide-bound 
BRI1(LRR) (shown in the same orientation) around the brassinolide-binding 
site, respectively. The area highlighted with the yellow dashed square on the left 
panel is the brassinolide-binding site. White, blue and red indicate neutral, 
positive and negative surfaces, respectively. 
and interaction with BRI1’*. Although stable BRI1-BAK1 association
is a consequence of initial BRI1 activation’’, a possible role of BAK1
and its homologues in initial BRI1 activation has not been excluded by
genetic studies; for example, it remains unclear whether mutations of
BAK1 and its homologues affect brassinolide-induced BKI1
phosphorylation by BRI1. It is interesting to note that BAK1 has five
LRRs and that the new brassinolide-created surface is also located 
about five LRRs from the membrane surface. Thus, it is also possible 
that the brassinolide-created surface is involved in interaction with 
BAK1(LRR), which triggers BRI1 phosphorylation of BKI1 through 
unknown mechanisms, allowing formation of more stable BRI1-BAK1
receptor complexes. Although the brassinolide-created surface
is proposed to be involved in BRI1(LRR) homodimerization or
heterodimerization with BAK1(LRR), further studies are needed to 
determine if brassinosteroids initiate signalling through promoting 
protein-protein interactions, as many other plant hormones do”””. 


METHODS SUMMARY 


The extracellular domain (residues 24-784) of Arabidopsis BRI1 fused with 
6×His at the C terminus was expressed in High Five cells using a baculovirus
expression system. Protein was harvested from the media and purified using Ni- 
NTA followed by size-exclusion chromatography (Superdex200 10/300 GL col- 
umn; GE Healthcare). Crystals of free BRI1 and brassinolide-bound BRI1 were 
grown using the hanging-drop vapour-diffusion method and their structures 
were determined using molecular replacement. ³H-labelled brassinolide binding
assay was performed as previously described’’. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS


Protein expression and purification. The LRR domain of BRI1 (residues 24–784)
from Arabidopsis with an engineered C-terminal 6×His tag was generated
by standard PCR-based cloning strategy and its identity was confirmed by
sequencing. The protein was expressed in High Five cells using the vector
pFastBac 1 (Invitrogen) with a modified N-terminal Hemolin peptide. One litre
of cells (2.0 × 10⁶ cells ml⁻¹) was infected with 20 ml baculovirus using a
multiplicity of infection of 4 at 22 °C, and protein was harvested from the media after
48 h. The protein was purified using Ni-NTA (Novagen) and size-exclusion
chromatography (Superdex 200, Pharmacia) in buffer (10 mM Tris, pH 8.0,
100 mM NaCl). Samples from relevant fractions were applied to SDS-PAGE
and visualized by Coomassie blue staining. Protein purification was performed
at 4 °C. For crystallization of BRI1(LRR), the purified protein was concentrated to
about 3.0 mg ml⁻¹ in buffer containing 10 mM Tris, pH 8.0, 100 mM NaCl.
Crystallization, data collection, structure determination and refinement. 
Crystals of BRI1(LRR) were generated by mixing the protein with an equal 
amount of well solution (1 μl) by the hanging-drop vapour-diffusion method.
A mixture of BRI1(LRR) and brassinolide with a molar ratio of 1:10 was used for
generating crystals of their complex. The initial buffer producing crystals of both
free BRI1(LRR) and brassinolide-bound BRI1(LRR) contained 0.2 M Na₂SO₄ and
20% (w/v) polyethylene glycol (PEG) 3,350, which was further optimized by
adding 10 mM trimethylamine-HCl. Crystals grew to their maximum size
(0.1 × 0.1 × 0.1 mm³) within 10 days at room temperature (20 °C).

All the diffraction data sets were collected at the Shanghai Synchrotron 
Radiation Facility (SSRF) at beam line BL17U1 using a CCD detector. Crystals 
of both brassinolide-free and brassinolide-bound BRI1(LRR) belong to space 
group C2 with one protein molecule per asymmetric unit. For data collection, 
the crystals were equilibrated in a cryoprotectant buffer containing reservoir 
buffer plus 20.0% (v/v) glycerol. The data were processed using HKL2000 (ref. 
31). Molecular replacement (MR) with the program PHASER” was used to solve 
the crystal structures of both brassinolide-free and brassinolide-bound 
BRI1(LRR). The atomic coordinates of PGIP2 (ref. 21) (PDB code 1OGQ) and 
InlA* (PDB code 1O6V) were used as the initial searching model. The model
from MR was built with the program COOT™ and subsequently subjected to 
refinement by the program PHENIX”. The electron density for brassinolide in 
BRI1(LRR)-brassinolide complex crystal became apparent after refinement of 
BRI1(LRR). The final refined model contains residues 30-775 of BRI1(LRR) and 
one brassinolide molecule in brassinolide-bound BRI1(LRR), and residues 30- 
772 in brassinolide-free BRI1(LRR) except the two segments containing residues 
590-595 and 644-646 that have no clear electron density and are presumed to be 
disordered in solution. The structure figures were prepared using PyMOL”™.


Gel filtration assay. BRI1(LRR) protein purified as described earlier was sub- 
jected to gel filtration analysis (Superdex200 10/300 GL column; GE Healthcare) 
in the presence and absence of brassinolide. Buffer containing 50 mM NaH₂PO₄/
Na₂HPO₄, pH 7.4, 100 mM NaCl, 2.0 mM MgCl₂ and 5% (v/v) DMSO was used
for the assay. A mixture of BRI1(LRR) and brassinolide with a molar ratio of
about 1:10 was used to test the effect of brassinolide on the apparent molecular
weight of BRI1(LRR). The assays were performed with a flow rate of 0.5 ml min⁻¹
and an injection volume of 0.5 ml buffer containing BRI1(LRR) (0.83 mg ml⁻¹)
with or without brassinolide at 20 °C. The column was calibrated using five
standard proteins, β-amylase 200 kDa, alcohol dehydrogenase 150 kDa, albumin
66 kDa, carbonic anhydrase 29 kDa and cytochrome C 12.4 kDa, which had elution
volumes (Ve) of 12.53 ml, 13.25 ml, 14.25 ml, 16.18 ml and 16.97 ml, respectively,
under the conditions of assay. The void volume (Vo) for the column used
was determined to be 8.15 ml using fresh blue dextran solution (1.0 mg ml⁻¹).
The calibration curve of the gel-phase distribution coefficient (Kav) versus log
molecular weight (M) was obtained using Excel. Kav = (Ve − Vo)/(Vc − Vo),
where Ve is the elution volume, Vo is the void volume and Vc is the geometric column
volume, 24 ml for the column used. The elution volume of BRI1(LRR) in the
presence and absence of brassinolide was 13.63 ml under the conditions of assay,
corresponding to an apparent molecular weight of 109.4 kDa based on the
calibration equation Kav = −0.2366 lg M + 1.538, R² = 0.9814.
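
The conversion from elution volume to apparent molecular weight can be retraced with a short Python sketch. The function names are illustrative, and the code assumes that M in the calibration equation is expressed in daltons, which is the reading under which the quoted figures agree.

```python
def kav(elution_ml, void_ml=8.15, column_ml=24.0):
    """Gel-phase distribution coefficient: (Ve - Vo) / (Vc - Vo)."""
    return (elution_ml - void_ml) / (column_ml - void_ml)


def apparent_mw_kda(elution_ml, slope=-0.2366, intercept=1.538,
                    void_ml=8.15, column_ml=24.0):
    """Invert the calibration line Kav = slope * lg(M) + intercept.

    M is assumed to be in daltons; the result is returned in kDa.
    """
    log_m = (kav(elution_ml, void_ml, column_ml) - intercept) / slope
    return 10 ** log_m / 1000.0


# Elution volume reported for BRI1(LRR) with or without brassinolide.
print(round(kav(13.63), 3))
print(round(apparent_mw_kda(13.63), 1))  # close to the 109.4 kDa quoted in the text
```

Because the calibration is logarithmic, small shifts in elution volume translate into comparatively large changes in the estimated mass, which is worth keeping in mind when comparing the apparent and theoretical values.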

³H-labelled brassinolide-binding assay. His-tagged BRI1(LRR), or BSA as control,
was incubated with 20 nM ³H-brassinolide, or 20 nM ³H-brassinolide mixed
with 20 μM non-radiolabelled brassinolide as competitor, and nickel beads in the
brassinolide-binding buffer (0.25 M mannitol, 10 mM Tris-2-[N-morpholino]
ethanesulphonic acid (MES), pH 5.7, 5 mM MgCl₂, 0.1 mM CaCl₂) for 30 min.
The beads were washed three times, and His-BRI1(LRR) and bound
[³H]-brassinolide were eluted by using 1 M imidazole. The eluted radioactivity was
measured by scintillation counter. 

Subunit arrangement and phenylethanolamine
binding in GluN1/GluN2B NMDA receptors 


Erkan Karakas¹, Noriko Simorowski¹ & Hiro Furukawa¹


Since it was discovered that the anti-hypertensive agent ifenprodil 
has neuroprotective activity through its effects on NMDA (N- 
methyl-D-aspartate) receptors’, a determined effort has been made 
to understand the mechanism of action and to develop improved 
therapeutic compounds on the basis of this knowledge*~*. 
Neurotransmission mediated by NMDA receptors is essential for 
basic brain development and function’. These receptors form 
heteromeric ion channels and become activated after concurrent 
binding of glycine and glutamate to the GluN1 and GluN2 
subunits, respectively. A functional hallmark of NMDA receptors 
is that their ion-channel activity is allosterically regulated by binding
of small compounds to the amino-terminal domain (ATD) in a
subtype-specific manner. Ifenprodil and related phenylethanolamine
compounds, which specifically inhibit GluN1 and GluN2B
NMDA receptors®’, have been intensely studied for their potential 
use in the treatment of various neurological disorders and diseases, 
including depression, Alzheimer’s disease and Parkinson’s dis- 
ease**. Despite considerable enthusiasm, mechanisms underlying 
the recognition of phenylethanolamines and ATD-mediated allos- 
teric inhibition remain limited owing to a lack of structural 
information. Here we report that the GluN1 and GluN2B ATDs 
form a heterodimer and that phenylethanolamine binds at the inter- 
face between GluN1 and GluN2B, rather than within the GluN2B 
cleft. The crystal structure of the heterodimer formed between the 
GluN1b ATD from Xenopus laevis and the GluN2B ATD from 
Rattus norvegicus shows a highly distinct pattern of subunit 
arrangement that is different from the arrangements observed in 
homodimeric non-NMDA receptors and reveals the molecular 
determinants for phenylethanolamine binding. Restriction of 
domain movement in the bi-lobed structure of the GluN2B ATD, 
by engineering of an inter-subunit disulphide bond, markedly 
decreases sensitivity to ifenprodil, indicating that conformational 
freedom in the GluN2B ATD is essential for ifenprodil-mediated 
allosteric inhibition of NMDA receptors. These findings pave the 
way for improving the design of subtype-specific compounds with 
therapeutic value for neurological disorders and diseases. 

The consensus view that has emerged from functional studies of 
NMDA receptors using site-directed mutagenesis and molecular model- 
ling is that phenylethanolamine compounds such as ifenprodil and Ro 
25-6981 bind to the ATD of the GluN2B subunit. However, this has not 
been established directly and the mechanism of action is complicated by 
the obligate heteromeric assembly of NMDA receptors. To establish 
directly that phenylethanolamines bind to the ATDs of these receptors, 
we used isothermal titration calorimetry to measure the binding of 
ifenprodil and Ro 25-6981 to purified recombinant Rattus norvegicus 
GluN2B (residues 31-394) and Xenopus laevis GluN1b (residues 23- 
408) ATDs (Supplementary Fig. 1). GluN1b from Xenopus laevis*” was 
used in this study because of its superior biochemical stability compared 
to other orthologues. It is 93% identical in primary sequence to the 
Rattus norvegicus GluN1 ATD and is capable of forming functional 
NMDA-receptor ion channels that undergo ifenprodil inhibition when 
combined with Rattus norvegicus GluN2B’ (Supplementary Fig. 2). 


When the GluN1b ATD or GluN2B ATD proteins were individually 
injected with ifenprodil, there was no evidence of binding (Fig. 1a). 
However, when a mixture of the GluN1b and GluN2B ATD proteins 
was injected with ifenprodil or Ro 25-6981, a dose-dependent heat 
exchange was observed, with dissociation constant (Kd) values of
320 nM and 60 nM, respectively (Fig. 1a and Supplementary Fig. 3).
Thus, both the GluN1b and GluN2B ATDs are required for binding 
of phenylethanolamines. 

The necessity of both ATDs for recognition of phenylethanolamine 
indicates that binding takes place in the GluN1-GluN2B heteromer. To 
probe the association pattern of GluN1b and GluN2B ATD proteins, 
Figure 1 | Binding of phenylethanolamine requires both GluN1b and
GluN2B ATDs, and stabilizes heterodimers. a, Calorimetric titration of
ifenprodil into a GluN1b and GluN2B ATD mixture (upper panel) and
integrated heat as a function of the ifenprodil/protein molar ratio (lower panel)
for GluN1b ATD (open circles), GluN2B ATD (filled squares) and the GluN1b/
GluN2B ATD mixture (filled circles). b, Weighted-average sedimentation
coefficient (Sw) for GluN1b ATD alone (green), GluN2B ATD alone (black)
and the GluN1b-GluN2B ATD mixture in the presence (cyan) and absence
(red) of 10 μM ifenprodil, fitted with a monomer-dimer model (lines).

c, d, Sedimentation equilibrium analysis of GluN1b and GluN2B ATDs in the 
absence (c) and presence (d) of 10 1.M ifenprodil. Data points at a rotor speed of 
18,000 r.p.m. (red dots) are shown with a global fit (black line) of the data. 
Residuals from the fit are shown in the lower panel. 

we determined the mass of the ATD proteins in solution by sedimenta- 
tion experiments (Fig. 1b-d). Although the individual GluN1b ATD 
and GluN2B ATD were exclusively monomeric at 1.2 mg ml⁻¹ 
(Fig. 1b), they formed a heterodimer with a Kd of 0.7-1 µM when 
mixed together (Fig. 1b, c). Notably, when ifenprodil was included in 
the GluN1b/GluN2B ATD protein mixture, the heterodimerization 
was strengthened by at least 20-fold (Fig. 1b, d). These results establish 
that the GluN1b and GluN2B ATDs form heterodimers and that 
phenylethanolamines probably bind at the GluN1b-GluN2B subunit 
interface. 
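
To give a sense of scale for the 20-fold strengthening, the following sketch solves the simple A + B ↔ AB equilibrium for the heterodimer concentration. The total protein concentration (1 µM of each ATD) is a hypothetical value chosen for illustration, the ligand-free Kd is taken as the upper end of the reported range, and "at least 20-fold" is treated as exactly 20-fold; the function names are not from the original analysis.

```python
import math

def heterodimer_conc(total_a, total_b, kd):
    """Equilibrium [AB] for A + B <-> AB (all values in the same units)."""
    s = total_a + total_b + kd
    return (s - math.sqrt(s * s - 4.0 * total_a * total_b)) / 2.0

def fraction_bound(total_a, total_b, kd):
    """Fraction of the limiting component present as heterodimer."""
    return heterodimer_conc(total_a, total_b, kd) / min(total_a, total_b)

conc_um = 1.0                  # hypothetical total of each ATD, in uM
kd_apo = 1.0                   # uM, upper end of the reported range
kd_ifenprodil = kd_apo / 20.0  # assumed exactly 20-fold tighter

apo_fraction = fraction_bound(conc_um, conc_um, kd_apo)
bound_fraction = fraction_bound(conc_um, conc_um, kd_ifenprodil)
print(f"heterodimer fraction without ifenprodil: {apo_fraction:.2f}")
print(f"heterodimer fraction with ifenprodil:    {bound_fraction:.2f}")
```

Under these assumed concentrations the heterodimer goes from a minority species to the dominant one, which is consistent with the shift seen in the sedimentation data.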

To understand the nature of the subunit interaction between 
GluN1b and GluN2B at their ATDs, and to pinpoint the location of 
the phenylethanolamine binding site, we conducted crystallographic 
studies on the GluN1b and GluN2B ATD proteins (Supplementary 
Table 1). The crystallographic analysis showed that the GluN1b and 
GluN2B ATDs exist as heterodimers in both ifenprodil-bound and Ro 
25-6981-bound forms (Fig. 2). No notable structural difference was 
observed between the monomers of GluN1b ATD (Supplementary 
Fig. 4) or GluN2B ATD”° and the respective subunits in the 
GluN1b-GluN2B ATD complex, indicating that dimerization did 
not cause changes in the overall conformation. Most notably, the 
crystal structures clearly identified the phenylethanolamine binding 
site at the heterodimer interface (Fig. 2). 

Both the GluN1b and GluN2B ATDs have bi-lobed clamshell-like 
architectures composed of R1 and R2 domains that are roughly similar 
in secondary-structure distribution to non-NMDA-receptor ATDs'!™. 
However, the structures of the GluN1b and GluN2B ATD monomers 
cannot be superimposed onto non-NMDA-receptor ATD monomers, 
owing to a major difference in the R1-R2 orientations, as was also 
observed previously in a study of the GluN2B ATD monomer’® 
(Supplementary Fig. 5). The unique R1-R2 orientations of the GluN1 
and GluN2B ATDs result in a heterodimer assembly that is distinct 
from that observed in non-NMDA-receptor ATD homodimers'** 
(Supplementary Fig. 6). Whereas non-NMDA-receptor ATD subunits 
form symmetrical homodimers through strong R1-R1 and R2-R2 
interactions, the GluN1b and GluN2B ATDs associate with each other 
asymmetrically through R1-R1 and R1(GluN1b)-R2(GluN2B) inter- 
actions'”? (Fig. 2). No residue from GluN1b R2 is involved in the 
GluN1b-GluN2B interaction (Fig. 2b). The R1-R1 interface contains 
hydrophobic interactions mediated by residues from the cores of the α2 
helix and α3 helix in GluN1b, and from the α1′ helix and α2′ helix in 
GluN2B, surrounded by polar interactions involving the GluN1b α2 
helix, the GluN2B α1′ helix and the hypervariable loops'® (Supplemen- 
tary Fig. 7). The R1-R2 interface involves mainly polar interactions, 
involving residues on the α10 helix, a loop extending from 12 in 
GluN1b and loops extending from the β6′ sheet and β7′ sheet in 
GluN2B (Fig. 2d). The lack of R2-R2 interaction in the GluN1b and 
GluN2B ATDs leaves sufficient room for the previously suggested con- 
formational movement of the bi-lobed structure in GluN2B!°!°, which 
is important in mediating the allosteric regulation that is unique to 
NMDA receptors. In non-NMDA receptors, such movement is pro- 
hibited, owing to strong R2-R2 interactions that lock the movement of 
R2 (refs 3, 11-13). 

The heterodimeric arrangement of GluN1b and GluN2B creates a 
phenylethanolamine binding pocket composed of residues from 
GluN1b R1, GluN2B R1 and GluN2B R2 (Fig. 2). The phenylethanol- 
amine binding site has no overlap with the zinc binding site that is 
located in the GluN2B ATD cleft’®'® (Supplementary Fig. 8). In the 
crystal structure, ifenprodil is buried in the dimer interface with insuf- 
ficient space for entering or exiting (Fig. 2b), which indicates that 
binding occurs through an induced-fit mechanism and that unbinding 
may involve opening of the GluN2B ATD bi-lobed structure. All of the 
residues at the binding sites are identical among Xenopus laevis, rat and 
human orthologues, indicating that inhibition of NMDA receptors by 
phenylethanolamine is a conserved feature among those species (Sup- 
plementary Fig. 9). Binding of both ifenprodil and Ro 25-6981 is 

Figure 2 | Structure of the GluN1b-GluN2B ATD heterodimer in complex 
with ifenprodil at 2.6 Å resolution. a, View of the ATD heterodimer from the 
side. The GluN1b and GluN2B ATDs have bi-lobed architecture composed of 
R1 (magenta and cyan) and R2 (light pink and yellow) domains. Ifenprodil 
(grey spheres) sits at the heterodimer interface. N-glycosylation chains are 
shown in white. NT, N terminus; CT, C terminus. The cartoon shows an 
approximate orientation of the GluN1b and GluN2B ATDs with black sticks 
below R2 indicating the C-terminal ends where ligand-binding domains 
(LBDs) begin. b, Surface presentation of the GluN1b-GluN2B ATD 
heterodimer (upper panel) and of each subunit (lower panel), showing residues 
at the subunit interface in dark blue. Note that ifenprodil (grey spheres) is 
occluded in the subunit interface. The heterodimer buries 1,191 Å² of solvent- 
accessible surface area per subunit, with the GluN1b R1-GluN2B R1 and 
GluN1b R1-GluN2B R2 interfaces contributing 62% and 38%, respectively. 


mediated primarily through hydrophobic interactions between the 
benzylpiperidine group and a cluster of hydrophobic residues from 
the GluN1b α2 helix and α3 helix and the GluN2B α1′ helix and α2′ 
helix, and between the hydroxylphenyl groups and GluN1b Leu 135, 
GluN2B Phe 176 and GluN2B Pro 177 (Fig. 3a, b). Furthermore, the 

Figure 3 | Phenylethanolamine binding site. a, b, Binding of ifenprodil 
(a) and Ro 25-6981 (b) takes place at the GluN1b-GluN2B subunit interface. 
Mesh represents the Fo−Fc omit electron density map contoured at 3σ. 
Residues marked with asterisks in a have been previously shown to affect 
ifenprodil sensitivity. Adjacent to the binding pocket is an empty space 
surrounded by hydrophobic residues, including GluN1b Ala 75, GluN2B Ile 82 
and GluN2B Phe 114 (arrows). c, Comparison of binding patterns of ifenprodil 
(grey) and Ro 25-6981 (lime) in stereoview. The structure bound to Ro 25-6981 
is coloured as in b, whereas the ifenprodil-bound structure is coloured white. 
d, e, New residues found to interact with phenylethanolamines in this study 
were mutated and analysed for their effect on sensitivity to ifenprodil. Mutation 
of the residues surrounding the binding site caused changes in IC50 as well as 
changes in the extent of inhibition by ifenprodil. WT, wild-type; I/Imax, relative 
current with (I) and without (Imax) ifenprodil. Error bars represent s.d. 


drugs make three direct polar interactions with Ser 132 of GluN1b, 
Gln 110 of GluN2B and Asp 236 of GluN2B. Superposition of the 
binding sites of ifenprodil and Ro 25-6981 shows that the methyl 
and hydroxyl groups in the propanol moiety of both ligands face in 
opposite directions and that the benzylpiperidine groups sit in the 
binding pocket in similar ways (Fig. 3c). Consequently, Ro 25-6981 
has a higher affinity for GluN1/GluN2B NMDA receptors than ifen- 
prodil’” because the methyl group in Ro 25-6981 is in a favourable 
position to form a hydrophobic interaction involving Phe 176 and 
Pro 177 in the GluN2B subunit, whereas ifenprodil makes a weaker 
hydrophobic interaction with GluN1b, involving Leu 135. Extensive 
mutagenesis studies have previously indicated that GluN1b Tyr 109 
(ref. 18) and GluN2B Phe 176 and Asp 236 (ref. 19) are critical in 
mediating inhibition by ifenprodil, but whether these are involved in 
binding or transducing the inhibitory effect was unknown. We per- 
formed additional mutagenesis studies on newly identified residues in 
both GluN1b and GluN2B at the ifenprodil binding site, measured 
macroscopic currents by two-electrode voltage clamp, and revealed 
significant alterations in IC50 and in the extent of inhibition (Fig. 3d, 
e and Supplementary Table 2), thereby confirming the physiological 
relevance of the binding site. Notably, disruption of the ‘empty’ hydro- 
phobic space formed by GluN1b Ala 75, GluN2B Ile 82 and GluN2B 
Phe 114 (arrows in Fig. 3a and b) by site-directed mutations to hydro- 
philic residues had marked effects on sensitivity to ifenprodil (Fig. 3d, e). 
Thus, stabilization of this hydrophobic space by filling it with a hydro- 
phobic moiety may be a valid strategy to improve the design of phenyl- 
ethanolamine-based drugs. 
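
The two quantities compared in the mutagenesis experiments, the half-maximal inhibitory concentration and the extent of inhibition, can be pictured with a standard Hill-type inhibition curve. The sketch below is illustrative only: the functional form is a common convention rather than the fitting procedure documented in the paper, and the parameter values for the "wild-type" and "mutant" cases are hypothetical placeholders, not measured values.

```python
def relative_current(conc_um, ic50_um, max_inhibition=1.0, hill=1.0):
    """Relative current I/Imax remaining at a given inhibitor concentration.

    max_inhibition is the fraction of current blocked at saturation;
    hill is the Hill coefficient (assumed to be 1 here).
    """
    if conc_um <= 0:
        return 1.0  # no inhibitor, full current
    return 1.0 - max_inhibition / (1.0 + (ic50_um / conc_um) ** hill)

# Hypothetical parameter sets, NOT values from the paper
wild_type = dict(ic50_um=0.1, max_inhibition=0.9)
mutant = dict(ic50_um=1.0, max_inhibition=0.5)

for conc in (0.01, 0.1, 1.0, 3.0):
    wt = relative_current(conc, **wild_type)
    mt = relative_current(conc, **mutant)
    print(f"{conc:5.2f} uM  WT I/Imax = {wt:.2f}  mutant I/Imax = {mt:.2f}")
```

A mutation can therefore shift the curve along the concentration axis (a change in IC50), raise its plateau (a smaller extent of inhibition), or both.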

It is not known why phenylethanolamine binds specifically to the 
GluN1-GluN2B subunit combination. Although inspection of the 
primary sequences shows non-conservation of the critical binding-site 
residues between GluN2B and GluN2C or GluN2D (for example, the 
equivalent residue to GluN2B Phe 176 is not conserved in GluN2C or 
GluN2D), all of the residues in GluN2A are conserved except for 
GluN2B Ile 111 (Met112 in GluN2A) (Supplementary Fig. 10). 
Indeed, the mutations GluN2A Met112Ile or GluN2B Ile111Met do 
not confer or abolish ifenprodil sensitivity in GluN1/GluN2A or 
GluN1/GluN2B receptors, respectively (Supplementary Table 2). 
Thus, the insensitivity of the GluN1/GluN2A receptors to phenyletha- 
nolamine may stem from a fundamental difference in the mode of 
subunit association between GluN1/GluN2A and GluN1/GluN2B at 
their ATDs. 

To validate further the physiological relevance of the heterodimeric 
assembly, we engineered cysteine mutants at the subunit interface, 
using the ifenprodil-bound GluN1b/GluN2B ATD structure as a guide, 
in the context of the intact rat GluN1-4b/GluN2B receptor. These 
cysteines were designed to form spontaneous disulphide bonds if the 
mutated residues were proximal to each other. We designed two pairs of 
cysteine mutants, GluN1-4b (Asn70Cys) with GluN2B (Thr324Cys), 
and GluN1-4b (Leu341Cys) with GluN2B (Asp210Cys). These muta- 
tions ‘lock’ the R1-R1 and R1-R2 interfaces, respectively (Fig. 4a). We 
then expressed the mutant receptors in mammalian cell cultures and 
analysed them for formation of disulphide-linked oligomers in western 
blots. When mutant receptors of one subunit were co-expressed with 
wild-type receptors of the other, they gave rise to monomeric bands that 
were identical to wild-type GluN1-4b-GluN2B receptors in both redu- 
cing and non-reducing conditions (110 kDa and 170 kDa for GluN1-4b 
and GluN2B, respectively; Fig. 4b, arrows 2 and 3). In contrast, co- 
expressing pairs of the GluN1-4b-GluN2B cysteine mutants gave rise 
to a heterodimeric ~280 kDa band that was recognized by both anti- 
GluN1 and anti-GluN2B antibodies in non-reducing conditions 
(Fig. 4b, arrow 1). This confirms that the R1-R1 and R1-R2 subunit 
interfaces observed in the GluN1b-GluN2B ATD crystal structures are 
physiological and that the heterodimer, not the homodimer, is the basic 
functional unit in the ATD of the NMDA receptor”. Furthermore, 
disulphide crosslinking was observed in the presence and absence of 
ifenprodil, indicating that the ligand-free GluN1b-GluN2B ATDs may 
oscillate between the previously suggested open conformation’* and 
the closed conformation represented by the crystal structure described 
here. 

To understand the functional effects of locking the R1-R1 and 
R1-R2 interactions in the GluN1b and GluN2B ATDs, we measured 
macroscopic current responses from the ion channels of the cysteine- 
mutant receptors by two-electrode voltage clamp. First, we explored 
the effect on ion-channel activity of breaking the disulphide bonds. 

Figure 4 | Engineering of disulphide bonds at the subunit interface alters 
sensitivity to ifenprodil. a, Location of mutated residues at the R1-R1 and R1- 
R2 interfaces in the GluN1b and GluN2B ATDs (spheres), and location of the 
ifenprodil binding pocket (asterisk). b, Detection of disulphide bonds by anti- 
GluN1 and anti-GluN2B western blots in reducing (+DTT) and non-reducing 
(−DTT) conditions. Arrow 1, GluN1-4b-GluN2B heterodimer; arrows 2 and 
4, GluN2B monomers; arrows 3 and 5, GluN1-4b monomers. c, Macroscopic 
current recording of the wild-type and mutant receptors in the presence (red) 


Application of dithiothreitol (DTT) had a minor inhibitory effect on 
wild-type GluN1-4b-GluN2B receptors and on receptors contain- 
ing GluN1-4b (Asn70Cys) and GluN2B (Thr324Cys). In contrast, a 
2.5-fold potentiation was observed on breakage of the disulphide bond 
at the R1-R2 interface between GluN1-4b (Leu341Cys) and GluN2B 
(Asp210Cys) (Fig. 4c and Supplementary Fig. 11). This implies that 
locking the closed conformation in the GluN2B ATD bi-lobed struc- 
ture by the R1-R2 crosslink results in downregulation of ion-channel 
activity. We next tested the effects of the disulphide bonds on sensi- 
tivity to ifenprodil. Although the R1-R1 crosslink had only a minor 
effect, the R1-R2 crosslink almost completely abolished inhibition by 
ifenprodil, even at 3 µM (Fig. 4d). When this R1-R2 disulphide cross- 
link was broken by the application of DTT, the mutant receptors 
regained sensitivity to ifenprodil, to a similar extent to that of receptors 
composed of wild-type GluN1-4b and GluN2B (Asp210Cys) in non- 
reducing conditions (Fig. 4d and Supplementary Fig. 12). This indi- 
cates that ifenprodil cannot bind to the GluN1b-GluN2B ATD when 
the R1-R2 interaction is locked and thus, when the GluN2B ATD 
clamshell is closed. Taken together, the experiments described above 
indicate that the binding of ifenprodil requires an opening of the 
GluN2B bi-lobed structure and that inhibition by ifenprodil involves 
closure of the clamshell through the GluN1b R1-GluN2B R2 inter- 
action (Fig. 4e). 

and absence (black) of DTT (2 mM). d, Effect of disulphide bonds on the 
sensitivity to ifenprodil (IF) of wild-type and mutant receptors in the presence 
(red) and absence (black) of DTT. e, Possible model of ifenprodil binding and 
the movement of ATDs for allosteric inhibition. Ifenprodil binds to the open 
GluN2B clamshell and induces domain closure, resulting in allosteric 
inhibition. In the GluN1-4b (Asn70Cys)—GluN2B (Thr324Cys) receptor, the 
GluN2B ATD is locked in the closed conformation so ifenprodil cannot access 
the binding site. 


This study shows that phenylethanolamine binds at the GluN1- 
GluN2B subunit interface through an induced-fit mechanism and 
that allosteric inhibition involves stabilization of the GluN2B ATD 
clamshell structure in a closed conformation. The binding mechanism 
presented here provides a molecular blueprint for improving the 
design of therapeutic compounds targeting the ATD of the NMDA 
receptor. 


METHODS SUMMARY 


GluN1b and GluN2B ATDs were expressed as secreted proteins using the insect- 
cell/baculovirus system and purified using metal-chelate chromatography and 
size-exclusion chromatography. Crystallization was performed in hanging-drop 
vapour diffusion configuration in a buffer containing 20% PEG3350, 150 mM 
KNO3 and 50 mM HEPES-NaOH (pH 7.0) for the GluN1b ATD, or 3.0-3.5 M 
sodium formate and 0.1 M HEPES-NaOH (pH 7.5) for the GluN1b-GluN2B ATD 
heterodimer. Diffraction data sets obtained at 100 K were indexed, integrated and 
scaled using HKL2000. The GluN1b ATD structure was solved by the single 
anomalous diffraction phasing method using Se-Met-incorporated crystals, and 
the GluN1b-GluN2B ATD structures were solved by molecular replacement using 
coordinates of GluN1b ATD and GluN2B ATD (Protein Data Bank code 3JPW’”). 
Model refinement was conducted using the program Phenix”. Experiments invol- 
ving analytical ultracentrifugation and isothermal titration calorimetry were con- 
ducted using the purified protein samples in their glycosylated form. Ion-channel 
activities of full-length NMDA receptors were measured by whole-cell recording 
from cRNA-injected Xenopus laevis oocytes, using a two-electrode voltage-clamp 
configuration. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 

METHODS 

Expression, purification and crystallization of GluN1b and GluN2B ATDs. 
The Xenopus laevis GluN1b ATD (Met 1 to Glu 408), containing Cys22Ser, 
Asn61Gln and Asn371Gln mutations, was C-terminally fused to a thrombin 
cleavage site followed by an octa-histidine tag. The Xenopus laevis GluN1b ATD 
and rat GluN2B ATD" constructs were individually expressed or co-expressed 
using the High Five (Trichoplusia ni) baculovirus system (DH10multibac)”. 
Purification was performed using a similar method to that described previously’ 
except that the proteins were de-glycosylated by endoglycosidase F1 after puri- 
fication by metal-chelate chromatography, and 1 µM ifenprodil or 1 µM Ro 25- 
6981 was included in the running buffer of size-exclusion chromatography 
(Superdex200) for isolation of the GluN1b-GluN2B ATD complex. The proteins 
used for isothermal titration calorimetry and sedimentation experiments were 
purified without the endoF1 de-glycosylation step and in the absence of ifenprodil 
or Ro 25-6981. Se-Met-incorporated GluN1b ATD proteins were expressed using 
methionine-free media (ESF921) supplemented with DL-Se-Met (Sigma) at 
100 mg l⁻¹ (ref. 10). The GluN1b ATD and GluN1b-GluN2B ATDs were crystal- 
lized by hanging-drop vapour diffusion at 17 °C by mixing the protein (8 mg ml⁻¹) 
at a 1:1 ratio with a reservoir solution containing 20% PEG3350, 150 mM KNO3 
and 50 mM HEPES-NaOH (pH 7.0) for GluN1b ATD, or at a 2:1 ratio with a 
solution containing 3.0-3.5 M sodium formate and 0.1 M HEPES (pH 7.5) for the 
GluN1b/GluN2B ATDs. 

Data collection and structural analysis. Crystals were cryoprotected in buffers 
containing 20% PEG3350, 150 mM KNO3, 50 mM HEPES-NaOH (pH 7.0) and 
20% glycerol for GluN1b ATD, or 5 M sodium formate and 0.1 M HEPES-NaOH 
(pH 7.5) for GluN1b-GluN2B ATDs. X-ray diffraction data were collected at the 
X25 and X29 beamlines at the National Synchrotron Light Source and processed 
using HKL2000 (ref. 23). Single anomalous diffraction data for the Se-Met- 
incorporated GluN1b ATD crystals were collected at the peak wavelength 
(0.9788 Å) and used for phasing by the program SHARP”. The initial model 
was built using flex-wArp”. The crystal structure of GluN1b-GluN2B ATD was 
solved by molecular replacement using the coordinates of GluN1b ATD and 
GluN2B ATD" (PDB code: 3JPW) with the program PHASER”. The models 
were built using COOT” and structural refinement was performed using the 
program PHENIX"!, 

Isothermal titration calorimetry. Proteins were dialysed overnight before the 
experiment against a buffer containing 150 mM NaCl, 20 mM Tris-HCl (pH 7.4) 
and 10% glycerol. Isothermal titration calorimetry measurements were performed 
using VP-ITC (MicroCal) by successive injections at 27 °C of 5 µl of 0.15 mM 
ifenprodil to 0.01 mM GluN1b ATD, 10 µl of 0.25 mM ifenprodil to 0.007 mM 
GluN2B ATD, 5 µl of 0.15 mM ifenprodil to 0.01 mM GluN1b-GluN2B ATD 
complex and 5 µl of 0.05 mM Ro 25-6981 to 0.007 mM GluN1b-GluN2B ATD 
complex. Data analysis was done using the software Origin 7.0 (Origin Labs). 

Analytical ultracentrifugation. Sedimentation velocity and equilibrium experi- 
ments were performed using a Beckman Coulter Optima XL-I analytical ultra- 
centrifuge. Proteins were dialysed against a buffer containing 150 mM NaCl and 
20 mM Tris (pH 7.4), with or without 10 µM ifenprodil. Sedimentation velocity 
experiments were performed by centrifuging protein samples loaded on 2-sector 
centrepieces at 42,000 r.p.m. at 20 °C. Concentration gradients were measured 
using interference optics or absorbance optics at a wavelength of 280 nm or 
230 nm depending on the protein concentrations loaded (0.01, 0.05, 0.1, 0.5 and 
1.2 mg ml⁻¹ for GluN1b-GluN2B ATD in the presence and absence of ifenprodil; 
0.1 and 1.2 mg ml⁻¹ for GluN1b ATD and 0.1, 0.5 and 5 mg ml⁻¹ for GluN2B 
ATD). Data were analysed using the continuous c(s) and c(M) distribution models 
implemented in Sedfit”*. The weighted-average sedimentation coefficient (Sw) was 
determined from the peak integration of c(s). 

Sedimentation equilibrium experiments were performed using a 6-channel 
centrepiece loaded with 100-µl protein samples at protein concentrations of 
0.05, 0.1 and 0.3 mg ml⁻¹ in the presence or absence of 10 µM ifenprodil. The 
samples were centrifuged sequentially at 9,000, 13,000 and 18,000 r.p.m. and 
allowed to reach equilibrium at each speed. Absorbance measurements were per- 
formed at wavelengths 230, 250 and 280 nm to obtain measurements at low and 
high protein concentrations. Global analysis of the data for multiple protein con- 
centrations and rotor speeds was performed using single-species and A + B ↔ AB 
models implemented in Heteroanalysis v1.1.44 (University of Connecticut). 

Electrophysiology. Recombinant GluN1/GluN2B NMDA receptors were 
expressed by co-injecting 0.1-0.5 ng of wild-type or mutant rat GluN1 and 
GluN2B cRNAs into defolliculated Xenopus laevis oocytes. The two-electrode 
voltage-clamp recordings were performed using agarose-tipped microelectrodes 
(0.4-1.0 MΩ) filled with 3 M KCl at a holding potential of −40 mV. The bath 
solution contained 5 mM HEPES, 100 mM NaCl, 0.3 mM BaCl2 and 10 mM 
Tricine at pH 7.4 (adjusted with KOH). Currents were evoked by the application 
of glycine and L-glutamate at 100 µM each. Inhibition by ifenprodil was monitored 
in the presence of agonists and various concentrations of ifenprodil. For redox 
experiments, the oocytes were preincubated in the bath solution supplemented 
with 2 mM DTT for 3 min before recording in the continuous presence of 2 mM 
DTT. Data were acquired and analysed by the program Pulse (HEKA). 

Cysteine crosslinking and western blot. Single point mutations were incorpo- 
rated into the genes encoding full-length rat GluN1-4b and GluN2B in the pCI 
vector (Promega). Human embryonic kidney 293 cells were transfected by Fugene 
HD (Roche) with a mixture of 0.5 µg of the GluN1-4b plasmid and 1 µg of the 
GluN2B plasmid. Cells were harvested 24-48 h after transfection and resuspended 
in a buffer containing 20 mM Tris-HCl (pH 7.4), 150 mM NaCl, 1% dodecyl-malto- 
side and a protease-inhibitor cocktail (Roche), as previously described”. After 
centrifugation at 150,000g, the supernatant was subjected to SDS-polyacrylamide 
gel electrophoresis (4-15%) in the presence or absence of 100 mM DTT. The 
proteins were transferred to Hybond-ECL nitrocellulose membranes (GE 
Healthcare). The membranes were blocked with TBST (20 mM Tris-HCl 
(pH 7.4), 150 mM NaCl and 0.1% Tween-20) containing 10% milk, then incubated 
with mouse monoclonal antibodies against GluN1 (MAB 1586, Millipore) or 
GluN2B (Invitrogen), followed by HRP-conjugated anti-mouse antibodies (GE 
Healthcare). Protein bands were detected by ECL detection kit (GE Healthcare). 

Coseismic and postseismic slip of the 2011 
magnitude-9 Tohoku-Oki earthquake 


Shinzaburo Ozawa, Takuya Nishimura, Hisashi Suito, Tomokazu Kobayashi, Mikio Tobita & Tetsuro Imakiire 


Most large earthquakes occur along an oceanic trench, where an 
oceanic plate subducts beneath a continental plate. Massive earth- 
quakes with a moment magnitude, Mw, of nine have been known to 
occur in only a few areas, including Chile, Alaska, Kamchatka and 
Sumatra. No historical records exist of an Mw = 9 earthquake along 
the Japan trench, where the Pacific plate subducts beneath the 
Okhotsk plate, with the possible exception of the AD 869 Jogan 
earthquake’, the magnitude of which has not been well con- 
strained. However, the strain accumulation rate estimated there 
from recent geodetic observations is much higher than the average 
strain rate released in previous interplate earthquakes” °. This find- 
ing raises the question of how such areas release the accumulated 
strain. A megathrust earthquake with Mw = 9.0 (hereafter referred 
to as the Tohoku-Oki earthquake) occurred on 11 March 2011, 
rupturing the plate boundary off the Pacific coast of northeastern 
Japan. Here we report the distributions of the coseismic slip and 
postseismic slip as determined from ground displacement detected 
using a network based on the Global Positioning System. The 
coseismic slip area extends approximately 400km along the 
Japan trench, matching the area of the pre-seismic locked zone‘. 
The afterslip has begun to overlap the coseismic slip area and 
extends into the surrounding region. In particular, the afterslip 
area reached a depth of approximately 100 km, with Mw = 8.3, on 
25 March 2011. Because the Tohoku-Oki earthquake released the 
strain accumulated for several hundred years, the paradox of the 
strain budget imbalance may be partly resolved. This earthquake 
reminds us of the potential for Mw ~ 9 earthquakes to occur along 
other trench systems, even if no past evidence of such events exists. 
Therefore, it is imperative that strain accumulation be monitored 
using a space geodetic technique to assess earthquake potential. 

Northeastern Japan has been struck by many M7-class (Mw = 7) 
interplate earthquakes along the Japan trench, where the Pacific plate 
subducts beneath the Okhotsk plate at a rate of 73-78 mm yr⁻¹ (refs 7, 
8; Fig. 1a). However, no interplate earthquake along the Japan trench 
with a surface-wave magnitude, M, of more than 7.5 has been instru- 
mentally recorded since 1923, except along the northernmost part of 
the trench, where there have been M = 7.9 and M = 7.6 earthquakes 
(Fig. 1b). There are no historical records of any Mw > 8.5 earthquakes 
occurring along the Japan trench since the seventeenth century. 
Therefore, the massive (Mw = 9) Tohoku-Oki earthquake was not 
widely anticipated despite the geological evidence for recurrent devas- 
tating tsunamis in the past, in particular in AD 869’, and the rapid 
accumulation of elastic strain along the trench. 

On the basis of earthquake catalogues covering several decades’, the 
seismic coupling coefficient, that is, the ratio between the rate of slip 
released in an interplate earthquake and the rate of relative plate 
motion, has been estimated to be 10-20% along the Japan trench”. 
However, ground displacement data acquired using a continuous 
Global Positioning System (GPS) network established in 1994 suggest 
that there is strong interplate coupling along the Japan trench*® 
(Fig. 1b). The strain accumulation rate estimated from contemporary 
deformation is considerably higher than the average strain rate 


released in historical earthquakes. An episodic aseismic slip, including 
an afterslip, has been suggested as a possible mechanism for significant 
elastic strain release*””. 
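
The imbalance described above can be made concrete with a simple slip-budget calculation: the slip deficit that accumulates on the plate interface is the plate convergence rate multiplied by the coupling fraction and the elapsed time. The sketch below uses the 73-78 mm yr⁻¹ convergence rate and the 10-20% catalogue-based coupling quoted in the text; the 500-year interval and the fully coupled case (coupling of 1.0) are assumptions chosen purely for illustration, and the function names are hypothetical.

```python
def slip_deficit_m(plate_rate_mm_yr, coupling, years):
    """Slip deficit (m) accumulated over a time span at a given coupling."""
    return plate_rate_mm_yr * coupling * years / 1000.0

def years_to_accumulate(slip_m, plate_rate_mm_yr, coupling):
    """Time (years) needed to accumulate a given slip deficit."""
    return slip_m * 1000.0 / (plate_rate_mm_yr * coupling)

interval_years = 500  # assumed interval, for illustration only

for coupling in (0.1, 0.2, 1.0):      # 10-20% from catalogues; 1.0 = fully locked
    for rate in (73.0, 78.0):         # mm/yr, range quoted in the text
        deficit = slip_deficit_m(rate, coupling, interval_years)
        print(f"coupling {coupling:.1f}, rate {rate:.0f} mm/yr: "
              f"{deficit:.1f} m in {interval_years} yr")
```

With weak coupling only a few metres of slip would be stored over several centuries, whereas a strongly coupled interface stores tens of metres over the same interval.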

In this Letter, first we describe the coseismic and postseismic defor- 
mations associated with the Tohoku-Oki earthquake, as detected by 
the GPS Earth Observation Network (GEONET) operated by the 
Geospatial Information Authority of Japan'’, and estimate the coseis- 
mic slip distribution and the subsequent afterslip distribution on the 
plate boundary by geodetic inversion’’ of the ground displacement at 
selected GPS sites. Second, we discuss the relationship between the 
coseismic and the postseismic slip models, as well as their relation to 
pre-seismic coupling and strain budget imbalance. 

The observed coseismic displacements show eastward movements of 
up to 5.3 m and subsidence by up to 1.2 m along the coastal line of the 
Tohoku region, relative to the Fukue site (Figs 1 and 2a). These values are 
greater by one order of magnitude than those recorded at the time of 
previous M7-M8-class interplate earthquakes in the Tohoku region that 
have taken place since the establishment of GEONET. After the 
Tohoku-Oki earthquake, a large postseismic deformation occurred 
(Fig. 2b). Although the postseismic deformation resembles the coseismic 
field, the displacements seem to be more broadly distributed. In particu- 
lar, the eastward displacement of the Pacific coastal area did not differ 
significantly from that of the western coastal area, whereas the eastward 
displacement on the Pacific coast was much larger than that of the 
western coastal area in the coseismic field. In addition, the Pacific coastal 
area near the source region was uplifted after the earthquake. 

The slip distribution estimated on the basis of the coseismic displace- 
ments shows a large slip of up to 27m near the epicentral area, 
extending approximately 400 km along the Japan trench at a depth 
of less than 60 km, which is the lower limit for the seismogenic zone 
along the subducting plate in this region’® (Figs 1b and 2a). The 
estimated moment is 3.43 × 10²² N m, assuming a uniform rigidity 
of 40 GPa, equivalent to that of an Mw = 9.0 earthquake. A uniform 
rigidity of 40 GPa is a rough average of 29, 41 and 50 GPa for the 
typical rigidities of upper crust, lower crust and upper mantle in 
northeastern Japan based on seismic data’*. The moment of our 
geodetic model closely matches the moment magnitude of 9.1 
inferred from seismic waveform analysis in ref. 15. The root mean 
squared deviation of this model is 0.011 m (Supplementary Figs 1 and 
2), and the estimated slip is well beyond the 1σ error (Supplementary 
Figs 3). Chequerboard and sensitivity tests show that spatial varia- 
tions of slip over the coseismic area are resolved to the scale of a few 
tens of kilometres and that the principal pattern is a stable feature. 
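
The conversion between seismic moment and moment magnitude used in this paragraph can be checked with the standard relation Mw = (log10 M0 − 9.1)/1.5, with M0 in N m. The sketch below takes the estimated moment as 3.43 × 10²² N m and averages the three layer rigidities quoted in the text. The 200-km down-dip fault width used to back out a mean slip is a hypothetical value chosen for illustration and is not taken from the paper.

```python
import math

def moment_magnitude(m0_newton_metres):
    """Moment magnitude Mw from seismic moment M0 (N m)."""
    return (math.log10(m0_newton_metres) - 9.1) / 1.5

# Rigidities of upper crust, lower crust and upper mantle quoted in the text
layer_rigidities_gpa = [29.0, 41.0, 50.0]
mean_rigidity_gpa = sum(layer_rigidities_gpa) / len(layer_rigidities_gpa)

m0 = 3.43e22  # N m, coseismic moment of the geodetic model
mw = moment_magnitude(m0)

# Mean slip implied by M0 = rigidity * area * slip
length_m = 400e3   # along-trench extent from the text
width_m = 200e3    # ASSUMED down-dip width, illustration only
mean_slip_m = m0 / (mean_rigidity_gpa * 1e9 * length_m * width_m)

print(f"mean rigidity: {mean_rigidity_gpa:.0f} GPa")
print(f"Mw = {mw:.2f}")
print(f"implied mean slip for the assumed area: {mean_slip_m:.1f} m")
```

The implied mean slip depends directly on the assumed fault width, so it should be read only as an order-of-magnitude check against the peak slip of the model.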

The estimated afterslip, which is based on the postseismic deforma- 
tion, occurs in the coseismic slip area and adjacent to it, expanding to 
the north, the south and in the dipping direction (Figs 2b and 3). The 
afterslip area has two modal centres: northwest of the centre of the 
coseismic slip and east of the Kanto region. These centres reflect the 
large postseismic displacement along the Pacific coast, which, unlike 
the coseismic displacement, extends north and south (Fig. 2). The root 
mean squared deviation of this model is 0.007 m (Supplementary Figs 
3 and 4). The estimated moment of the afterslip on 25 March is 


Geospatial Information Authority of Japan, Tsukuba, Ibaraki 305-0811, Japan. 
3.35 × 10²¹ N m, which is equivalent to that of an Mw = 8.3 earthquake 
(Figs 2b and 3). This moment is approximately 10% of that of the 
mainshock. We assume that the postseismic deformation transients 
are due solely to afterslip, although they are affected by the viscoelastic 
relaxation of the asthenosphere and poroelastic rebound'®. We estim- 
ate the magnitude of these effects by simple calculation. The visco- 
elastic relaxation model’’ is predicted to be within 1 cm of surface 
displacement for two weeks, with an asthenospheric viscosity of 
$10^{19}$ Pa s. Although poroelastic effects reach 20% of the observed postseismic 
deformation at maximum, their horizontal and vertical patterns 
are quite different from the observations. Thus, we assume that 
these effects can be ignored as a first approximation in this case. 
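
The conversion between seismic moment and moment magnitude used in this comparison can be checked with a short Python sketch. It assumes the standard relation $M_w = \tfrac{2}{3}(\log_{10} M_0 - 9.1)$ with $M_0$ in N m, and takes the afterslip moment as $3.35 \times 10^{21}$ N m. The "implied" mainshock moment is simply the afterslip moment divided by the stated fraction of about 10%, not an independent estimate.

```python
import math


def moment_magnitude(m0_newton_metres):
    """Moment magnitude from seismic moment in N m (standard relation)."""
    return (2.0 / 3.0) * (math.log10(m0_newton_metres) - 9.1)


AFTERSLIP_M0 = 3.35e21   # N m, afterslip moment on 25 March
STATED_FRACTION = 0.10   # afterslip moment as a fraction of the mainshock moment

afterslip_mw = moment_magnitude(AFTERSLIP_M0)

# Mainshock moment implied by the stated 10% fraction (illustrative only).
implied_mainshock_m0 = AFTERSLIP_M0 / STATED_FRACTION
implied_mainshock_mw = moment_magnitude(implied_mainshock_m0)

print(f"afterslip Mw = {afterslip_mw:.2f}")
print(f"implied mainshock Mw = {implied_mainshock_mw:.2f}")
```

Because the magnitude scale is logarithmic, a factor of ten in moment corresponds to two-thirds of a magnitude unit.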

The area of large afterslip is located in the region peripheral to the 
coseismic slip zone. In addition, aftershocks seem to occur in the 
Figure 1 | Tectonic setting in and around the Tohoku-Oki earthquake. 

a, Plate configurations of the Japanese islands”. The focal mechanism of the 
Tohoku-Oki earthquake is taken from the Global Centroid-Moment-Tensor 
Project'’. The red arrows indicate relative motion between the two plates at a 
plate boundary’*. b, Coupling distribution before the earthquake and recent 
seismicity along the Japan trench. The colour shading and contours indicate the 
degree of interplate coupling between the subducting Pacific plate and the 
overriding Okhotsk plate, estimated from GPS data recorded between April 
2000 and March 2001*. The degree of coupling is expressed as the backslip 
rate*’, which is a slip deficit from the relative plate velocity. The stars mark the 
epicentres of large (M ≥ 6.8) earthquakes that have occurred since 1923. The 
epicentres of the mainshock, a foreshock and earthquakes with M ≥ 7.4 are 
marked by yellow stars and labelled with their magnitudes and/or times of 
occurrence. The orange area is the source area of the M = 7.6 1994 earthquake”. 
The dashed line shows the northeastern limit of the subducted Philippine Sea 
plate’ (PHS). The Okhotsk plate overrides the Pacific plate north of this limit 
and the Philippine Sea plate overrides the Pacific plate south of this limit. The 
grey rectangle represents a fault patch to estimate the backslip rate. 


afterslip area, avoiding the large coseismic slip area (Fig. 3). This is 
consistent with the observation that in many cases the aftershocks and 
the large afterslip occur in an area where the coseismic slip is not 
large’*"’. The seismic moments of thrust-type aftershocks sum to 
$1.5 \times 10^{19}$ N m for the period of the postseismic deformation. This 
suggests that aftershocks contribute less than 1% of the moment of 
the afterslip model for two weeks. Although the estimated area of small 
afterslip overlaps the coseismic slip, we cannot rule out the possibility 
that this may be due to oversmoothing in the afterslip estimation, 
because a sensitivity test of the smoothing constraint shows that the 
afterslip area avoids the large coseismic slip area in undersmoothed 
models. The expansion in the dipping direction reaches 80-100 km in 
depth, which is the lower limit for the coupling of the plates in this 
region. 

The propagation of the afterslip into an area deeper than the coseismic 
slip area was observed for the 1994 Sanriku-Haruka-Oki earthquake” 
(M = 7.6) and the 2003 Tokachi-Oki earthquake"* (M = 8.0), suggesting 
a general dipping expansion along the Japan and Kuril trenches. Because 
the interplate coupling rate is near zero at depths of more than 100 km, 
we think that the afterslip area terminates at this limit. 

The northward expansion of the afterslip area approaches the zone 
ruptured in the 1968 (M = 7.9) and 1994 (M = 7.6) earthquakes 
(Fig. 3). The afterslip area may terminate there because the source area 
of the 1994 earthquake is now strongly locked. The southward expan- 
sion has reached the Kanto region (Figs 1b, 2b and 3). The Philippine 
Sea plate overrides the Pacific plate in the Kanto region, south of the 
northeastern limit of the former plate, whereas the Okhotsk plate lies 
on the Pacific plate north of this limit*’ (Figs 1 and 3). There is a 
possibility that this change in the overriding plate stops the southward 
expansion of the afterslip at the limit of the Philippine Sea plate in the 
Kanto region. Our model estimates the afterslip distribution at inter- 
vals of 1 d, and the two modal centres of the afterslip area seem not to 
move significantly on this timescale whereas the slip magnitude 
increases rapidly. 

The ruptured area of the Tohoku-Oki earthquake closely matches the 
area estimated to have been strongly coupled before the earthquake* 
(Fig. 1b), although the centre of the coseismic slip area is shallower 
than that of the locked area. This finding indicates the extreme import- 
ance of the GPS observations in assessing the potential of a subduction 
earthquake occurring, as has been observed in other subduction 
zones”. A deeper part of the locked zone may release the remaining 
strain energy by further afterslip of the Tohoku-Oki earthquake. 

The moment accumulation rate attributed to the subduction of the 
Pacific plate in an area from latitude 36° N to 39.5° N along the Japan 
trench is estimated to have been approximately $1.6 \times 10^{20}$ N m yr$^{-1}$ 
before the earthquake’. The repeated occurrence of $M_w < 8$ interplate 
earthquakes contributes 10-20% of the plate motion, without taking 
the afterslip into account**. It remains unclear how much energy will 
be released by the afterslip of the Tohoku-Oki earthquake. In the case 
Figure 2 | Coseismic and postseismic displacements and estimated slip. 

a, Coseismic displacements for 10-11 March 2011, relative to the Fukue site. 
The black arrows indicate the horizontal coseismic movements of the GPS sites. 
The colour shading indicates vertical displacement. The star marks the location 
of the earthquake epicentre. The dotted lines indicate the isodepth contours of 
the plate boundary at 20-km intervals”. The solid contours show the coseismic 
slip distribution in metres. b, Postseismic displacements for 12-25 March 2011, 
relative to the Fukue site. The red contours show the afterslip distribution in 
metres. All other markings represent the same as in a. 


Figure 3 | Coseismic slip, postseismic slip and aftershocks. Estimated 
coseismic slip (black contour, 4-m interval) and postseismic slip (red contour, 
0.2-m interval) of the Tohoku-Oki earthquake for the same period as in Fig. 2. 
The green dashed line indicates the northeast limit of the Philippine Sea plate. 
The blue dashed line indicates the ruptured area of the M = 7.6 1994 
earthquake’. The grey circles show the epicentres of the aftershocks of the 
Tohoku-Oki earthquake for 11-25 March 2011. All other markings represent 
the same as in Fig. 2. 


of the 1994 Sanriku-Haruka-Oki earthquake, the afterslip released 
moment equivalent to that of the mainshock over the course of 1 yr 
(ref. 10). In other cases, it has been reported that the postseismic slip 
releases a moment of up to ~30% of that of the mainshock over several 
weeks after M8-class earthquakes’. The moment released by afterslip 
after the 2004 Sumatra earthquake is also estimated to have been 30% 
of the moment of the mainshock over 40 d (ref. 24). If we assume that 
the afterslip of the Tohoku-Oki earthquake eventually releases 30- 
100% of the amount of energy released by the mainshock, and extra- 
polate the interplate coupling from the geodetic observations, we find 
that it would take approximately 350-700 yr for energy equivalent to 
that of the earthquake to accumulate along the Japan trench. 
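
The order of magnitude of this recurrence estimate can be reproduced with a simple moment budget, sketched below in Python. The sketch rests on several assumptions that are not spelled out in the text: the mainshock is taken as $M_w$ 9.0 and converted to moment with the standard relation, the accumulation rate is read as $1.6 \times 10^{20}$ N m yr$^{-1}$, the afterslip adds 30-100% to the mainshock moment, and smaller earthquakes remove 10-20% of the accumulation. It is not the authors' calculation, only one plausible way to combine the quoted numbers.

```python
def recurrence_interval_years(mainshock_m0, afterslip_fraction,
                              accumulation_rate, small_eq_fraction):
    """Years needed to accumulate the moment released by mainshock plus afterslip.

    afterslip_fraction: afterslip moment as a fraction of the mainshock moment.
    small_eq_fraction:  share of the accumulation released by smaller earthquakes.
    """
    total_release = mainshock_m0 * (1.0 + afterslip_fraction)
    net_rate = accumulation_rate * (1.0 - small_eq_fraction)
    return total_release / net_rate


# Assumed inputs (see the paragraph above).
MAINSHOCK_M0 = 10.0 ** (1.5 * 9.0 + 9.1)   # N m, for an assumed Mw 9.0
ACCUMULATION_RATE = 1.6e20                 # N m per year

shortest = recurrence_interval_years(MAINSHOCK_M0, 0.3, ACCUMULATION_RATE, 0.1)
longest = recurrence_interval_years(MAINSHOCK_M0, 1.0, ACCUMULATION_RATE, 0.2)
print(f"recurrence interval: {shortest:.0f} to {longest:.0f} yr")
```

With these inputs the budget gives a few hundred years at both ends of the range, the same order as the interval quoted above. The exact bounds depend on the mainshock moment and on the along-trench extent assumed for the accumulation rate.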

Recent geological studies suggest that tsunamis similar to that which 
followed the Tohoku-Oki earthquake have repeatedly struck the 
Pacific coast of northeastern Japan, with a recurrence interval of 
approximately 800-1,100 yr (ref. 1), implying that megathrust earth- 
quakes have also occurred repeatedly along the Japan trench. The 
massive Tohoku-Oki earthquake supports this hypothesis and may 
partly resolve the strain budget imbalance, although the roughly esti- 
mated recurrence interval is shorter than the tsunami return period. 

The Pacific coastal area subsided by up to 1.2 m following the earth- 
quake. Furthermore, subsidence of the Pacific coast at a rate of 
5-10 mm yr$^{-1}$ over the past 100 yr has been estimated by tide gauges”, 
levelling and GPS data. Although geodetic observations indicate both 
coseismic and interseismic subsidence, a geomorphological study has 
shown that there was long-term upheaval along the Pacific coast in the 
late Quaternary period”. This discrepancy suggests the existence of 
another mechanism of episodic uplift, such as postseismic deforma- 
tion”’. In fact, the Pacific coastal area near the epicentre started to uplift 
after the Tohoku-Oki earthquake by an amount ranging from 1 to 
4 cm, as observed over the course of two weeks (Fig. 2). It is difficult 
to predict the future temporal evolution of uplifting from such a short 
observation period. If the uplift lasts for a long time, the discrepancy 
between subsidence and upheaval will be resolved. To understand the 
uplift mechanism, it is important to continue geodetic monitoring. 


METHODS SUMMARY 


Coseismic and postseismic displacements were based on GPS data collected for 6 h 
and were analysed with the BERNESE GPS software. We used the east-west, 
north-south and up-down components, measured relative to the Fukue site, at 
approximately 400 selected GPS sites covering northeastern Japan (Fig. 1a). We 
used the Yabuki-Matsu’ura method” to estimate the slip distribution on the plate 
boundary, and used a fault patch that covers an area ~500 km in width and 
~800 km in length to represent the plate boundary in this region”®. The fault patch 
was represented by a parametric spline surface. Green’s functions’* that assume a 
homogeneous half-space were used. A detailed description of the GPS inversion 
approach, including resolution and sensitivity analysis, can be found in Methods 
and Supplementary Figs 5-9. The data set and the results of inversion are shown in 
Supplementary Tables 1-4. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS 


The objective of our study was to determine the distribution of the coseismic and 
postseismic slip of the Tohoku-Oki earthquake, Japan. The analysis was based on 
the modelling of the coseismic and postseismic deformation by a GPS network in 
Japan. A solution was obtained by inverting a set of 377 coseismic slip GPS vectors 
and 357 postseismic slip GPS vectors along the Japan trench. 

GPS data. The GPS data used in this study were derived by the most rapid of the 
three strategies in the baseline analysis of the GEONET routine”. In the strategy 
for this solution, 6-h data are processed to estimate the static coordinates using the 
software BERNESE 5.0 and the ultrarapid ephemerides of the International Global 
Navigation Satellite Systems Service (IGS) every 3 h. Tropospheric zenith delays 
were estimated for two time segments, and tropospheric gradients for a single time segment, using 
the Niell mapping function* in each session. The elevation cut-off angle for the 
GPS satellites was 15°, and absolute phase centre corrections to the GPS antennas 
were applied for each monument type of the GEONET stations. Thus, the GPS 
carrier-phase ambiguities were resolved. The site coordinates were compared with 
the 2005 International Terrestrial Reference Frame** (ITRF2005) by fixing a 
fiducial station (station 92110, close to the Tsukuba IGS station) as the a-priori 
value using ITRF2005. Because of the large coseismic displacement of the Tohoku- 
Oki earthquake, the fiducial station moved eastwards by ~0.5 m. In the baseline 
analysis after the earthquake, the a-priori coordinate of the fiducial station was 
corrected by adding the coseismic offset. The coseismic and postseismic displace- 
ments were calculated from the relative site coordinates with respect to the reference 
site (Fukue; station code 950462), which is ~1,400 km away from the earthquake 
epicentres. The repeatabilities of the relative coordinates were 4.0, 3.6 and 15.1 mm 
for the east-west, north-south and up-down components, respectively, which are 
averages of standard deviations for the 412 sites used during 1-10 February 2011. 
Model strategy. Coseismic offsets were estimated by subtracting the average 
coordinates for the period between 21:00 on 9 March and 09:00 on 11 March 
from the coordinates at 18:00 on 11 March 2011 (Japan Standard Time). 
According to the repeatabilities of the coordinates, the errors in the coseismic 
offsets were approximately 6 and 20 mm in the horizontal and vertical compo- 
nents, respectively. Postseismic data on 25 March were estimated by subtracting 
coordinates at 18:00 on 11 March from those at 18:00 on 25 March. We used the 
same error estimates as those for the coseismic deformation. We used 377 and 357 
GPS sites in inversion for coseismic slip and postseismic slip, respectively. 
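
The offset calculation and the quoted error levels can be sketched in a few lines of Python. The sketch assumes that an offset is the difference of two independent coordinate estimates, each with the repeatability given above, so that its error is about $\sqrt{2}$ times the repeatability. Treating the pre-event average as a single coordinate is a simplification, and the coordinate values in the example are hypothetical.

```python
import math

# Repeatabilities of the relative coordinates (mm), from the text.
REPEATABILITY_MM = {"east-west": 4.0, "north-south": 3.6, "up-down": 15.1}


def coseismic_offset(pre_event_coords, post_event_coord):
    """Post-event coordinate minus the average of the pre-event coordinates."""
    return post_event_coord - sum(pre_event_coords) / len(pre_event_coords)


def difference_error(sigma):
    """1-sigma error of the difference of two independent values with error sigma."""
    return math.sqrt(2.0) * sigma


offset_errors_mm = {name: difference_error(sigma)
                    for name, sigma in REPEATABILITY_MM.items()}

# Hypothetical east-west coordinates (m) before and after the earthquake.
example_offset = coseismic_offset([0.012, 0.010, 0.011], 4.311)

for name, err in offset_errors_mm.items():
    print(f"{name}: {err:.1f} mm")
print(f"example offset: {example_offset:.3f} m")
```

Under this assumption the horizontal errors come out near 5-6 mm and the vertical error near 21 mm, consistent with the approximate values of 6 and 20 mm given above.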

We created a parametric spline surface to represent the surface of the subduct- 
ing Pacific plate in the Tohoku region’***. The parametric spline surface consisted 
of 15 knots in the dipping direction and 25 knots along the Japan trench. An 
adopted spline surface covers approximately 500 km in width, 800 km in length 
and 200 km in depth. 

We used an inversion method” with minor modifications. In our inversion 
method, east-west and north-south slip components are represented by a para- 
metric spline surface, as is the case for a fault patch. The vertical slip component is 
estimated using the formula in ref. 12. Green’s functions are calculated using the 
formulation of ref. 12, which assumes a homogeneous isotropic half-space. We 
included a roughness matrix’? as prior information, imposing the condition 
$\lambda_1 M u = 0$ on the inversion equation, where $\lambda_1$ is a hyperparameter of roughness, 
$M$ is the roughness matrix’* and $u$ represents slip. We also adopted the prior 
information that the slip direction is parallel to the motion of the subducting 
Pacific plate, by imposing the condition $\lambda_2 (u_1 - u_2) = 0$ on the inversion equation, 
where $u_1$ and $u_2$ are slip vectors angled at 45° relative to the direction of plate 
motion and $\lambda_2$ is a hyperparameter. These roughness and slip rake constraints 
result in a stable solution. By including the roughness and slip directions, we were 
able to estimate the two hyperparameters by minimizing the Akaike Bayesian 
information criterion'**®. Minimization was done using Powell’s method’’. We 
set to zero the displacements at the edge of the fault surface and the displacements 
east of the Japan trench, as a boundary condition. 
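
The effect of a roughness constraint on an inversion can be illustrated with a toy one-dimensional problem in Python. This is a generic smoothed least-squares sketch, not the inversion method used here. It minimizes the data misfit plus a penalty on the second differences of the slip, with a fixed weight `lam` in place of hyperparameters selected by the Akaike Bayesian information criterion. The Green's function matrix (an identity matrix) and the data are hypothetical, and the slip rake constraint is omitted.

```python
def transpose(a):
    return [list(row) for row in zip(*a)]


def matmul(a, b):
    cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def roughness_matrix(n):
    """Second-difference operator: one (1, -2, 1) row per interior node."""
    rows = []
    for i in range(1, n - 1):
        row = [0.0] * n
        row[i - 1], row[i], row[i + 1] = 1.0, -2.0, 1.0
        rows.append(row)
    return rows


def smoothed_inversion(g, d, lam):
    """Minimize |d - g u|^2 + lam^2 |M u|^2 through the normal equations."""
    n = len(g[0])
    gt = transpose(g)
    m = roughness_matrix(n)
    gtg = matmul(gt, g)
    mtm = matmul(transpose(m), m)
    lhs = [[gtg[i][j] + lam ** 2 * mtm[i][j] for j in range(n)] for i in range(n)]
    rhs = [sum(gt[i][k] * d[k] for k in range(len(d))) for i in range(n)]
    return solve(lhs, rhs)


# Hypothetical setup: identity Green's functions and rough "observations".
identity = [[1.0 if i == j else 0.0 for j in range(5)] for i in range(5)]
data = [0.0, 1.0, 4.0, 1.0, 0.0]

unsmoothed = smoothed_inversion(identity, data, lam=0.0)
oversmoothed = smoothed_inversion(identity, data, lam=1000.0)
print(unsmoothed)
print(oversmoothed)
```

With no smoothing the solution reproduces the data exactly. With very strong smoothing it collapses to the best-fitting straight line. Choosing the weight between these extremes is the role played by the hyperparameter selection described above.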

Resolution and sensitivity of the inversion. We tested the resolution power 
by attempting to recover a given coseismic slip distribution, which is often called 
a chequerboard resolution test. We discretized the fault plane into regular 
chequerboard patterns with patches assigned either 0 or 5 m eastward interplate 
slip for the coseismic case (Supplementary Fig. 5a). A forward model introduces 
the chequerboard pattern as a fault slip condition and then simulates the displacements 
at the GPS sites. Gaussian random noise corresponding to GPS measurement 
errors was then added to obtain a set of synthetic data, which we 
subsequently inverted. The initial chequerboard pattern used in the test had an 
approximate size of 100 km × 100 km, similar to the wavelength 
of the structures we want to resolve, that is, the coseismic slip. The preferred 
models in the coseismic case recovered the main features of the chequerboard in 
most parts of the model region (Supplementary Fig. 5b). Resolution was generally 
good in the down-dip and onshore regions but was relatively poor near the trench. 
A chequerboard pattern was almost reproduced in the western part of the large 
coseismic slip area. This indicates that our inversion method was able to recover 
the main features of the coseismic slip distribution in the Tohoku-Oki earthquake. 
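
The steps of a chequerboard test (build a pattern, run a forward model, add noise, invert and compare) can be sketched in Python as follows. The forward model is a deliberately crude assumption: each cell is observed independently with a sensitivity that decays from the coast towards the trench. It stands in for the elastic Green's functions and carries no physical calibration. The damped inverse is likewise a generic stand-in for the inversion method used here.

```python
import random


def chequerboard(n_rows, n_cols, block, high=5.0, low=0.0):
    """Alternate `high` and `low` slip (m) in block x block cells."""
    return [[high if ((r // block) + (c // block)) % 2 == 0 else low
             for c in range(n_cols)] for r in range(n_rows)]


def sensitivity(n_cols):
    """Hypothetical sensitivity: weak at the trench (column 0), strong at the coast."""
    return [0.05 + 0.95 * c / (n_cols - 1) for c in range(n_cols)]


def forward(slip, sens, noise_sd, rng):
    """Synthetic data: sensitivity * slip plus Gaussian noise."""
    return [[sens[c] * u + rng.gauss(0.0, noise_sd) for c, u in enumerate(row)]
            for row in slip]


def damped_inverse(data, sens, damping):
    """Cell-by-cell damped least-squares estimate of slip."""
    return [[sens[c] * d / (sens[c] ** 2 + damping ** 2) for c, d in enumerate(row)]
            for row in data]


def column_rms_error(truth, estimate):
    """Root-mean-square recovery error for each column (trench to coast)."""
    n_rows = len(truth)
    return [(sum((estimate[r][c] - truth[r][c]) ** 2 for r in range(n_rows))
             / n_rows) ** 0.5 for c in range(len(truth[0]))]


rng = random.Random(0)                      # fixed seed for repeatability
truth = chequerboard(8, 8, block=2)         # 0 m or 5 m of slip per cell
sens = sensitivity(8)
synthetic = forward(truth, sens, noise_sd=0.02, rng=rng)
recovered = damped_inverse(synthetic, sens, damping=0.1)
errors = column_rms_error(truth, recovered)
print([round(e, 2) for e in errors])
```

In this toy setup the recovery error is largest in the columns nearest the trench, where the assumed sensitivity is weakest. This mirrors qualitatively the poorer resolution near the trench reported above.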

By using a set of roughness coefficients, ranging from the oversmoothed (Supplementary 
Fig. 6a-c) to the undersmoothed (Supplementary Fig. 6e, f), we tested 
the sensitivity of the slip distribution to the roughness coefficient, $\lambda_1$. Because $\lambda_2$, 
which constrains the slip direction, does not significantly affect the slip distribution, 
we show the sensitivity only for $\lambda_1$. Supplementary Fig. 6d shows the 
optimal slip model, as determined using the minimum Akaike Bayesian informa- 
tion criterion. This sensitivity test suggests that the estimated coseismic slip dis- 
tribution characterized by the area of large slip east of Sendai and the ~200-400- 
km-long slip area along the Japan trench is a robust feature that does not depend 
on the chosen roughness of the slip distribution. 

We conducted similar tests for the postseismic slip. Supplementary Fig. 7a 
shows the initial chequerboard pattern, in which we assigned 0.4 and 0 m eastward 
interplate slip to the pink and blue areas, respectively. The resulting model shows 
good recovery of the chequerboard pattern, especially in the large afterslip area of 
Figs 2 and 3, but it suggests relatively poor recovery in the offshore area, which is 
100 km away from land. Supplementary Fig. 8d depicts the optimal slip model. 
This sensitivity test indicates that the centre of the afterslip area is located in the 
down-dip area of the coseismic slip region, which is independent of the roughness 
constraint. However, the undersmoothed model (Supplementary Fig. 8e, f) shows 
that the afterslip concentrates along the down-dip edge of the large coseismic area. 

We also tested a situation in which the afterslip does not overlap the coseismic 
slip area, and checked whether the afterslip distribution assigned in this case was 
reproduced by our inversion. The assumed condition is the same as the chequer- 
board test for the postseismic case, except for a given slip distribution. The results 
show that the assigned slip is well recovered by inversion, although the area of 
small slip is extended into the coseismic area owing to the smoothness constraint 
(Supplementary Fig. 9). Thus, we conclude that the slip centre shifts to a deeper 
portion of the coseismic rupture area in the postseismic period. However, owing to 
the limited resolving power of the geodetic data, it is not clear whether the afterslip 
area overlaps the coseismic slip area. 
The lessons of Tohoku-Oki 


An exceptional data set documents surface deformation before, during and after the earthquake that struck northeastern 
Japan in March 2011. But models for assessing seismic and tsunami hazard remain inadequate. 


JEAN-PHILIPPE AVOUAC 


Earthquake science has entered a new era 
with the development of space-based 
technologies to measure surface deformation 
at the boundaries of tectonic plates and 
large faults. Japan has been at the forefront in 
implementing these technologies, in particu- 
lar with the deployment some 15 years ago of 
a network of continuously recording Global 
Positioning System (GPS) stations known as 
GeoNet. Papers analysing the data associated 
with the devastating Tohoku-Oki earthquake 
of 11 March 2011 are now appearing. The latest 
of these, by Ozawa et al.¹, is published 
online in Nature today. 


With a moment magnitude ($M_w$) of 
9.0, the Tohoku-Oki earthquake ranks 
among the largest ever recorded. The 
data collected at the GeoNet stations’ 
indicate that it resulted from the sud- 
den slip of a remarkably compact area 
(400 kilometres long by 200 kilometres 
wide) of the plate interface where the 
Pacific plate slides beneath the Okhotsk 
plate, on which northern Japan lies. The 
rupture area (Fig. 1) lies off the coast- 
line of Honshu, Japan’s biggest island, 
and extends east nearly all the way to 
the Japan trench — hence the particu- 
larly devastating tsunami produced by 
the earthquake. 

Other new papers” provide further 
information. A combination’ of GPS 
measurements and underwater acous- 
tic sounding shows that the sea bottom 
in the epicentral area moved seaward 
by as much as 24 metres and was 
uplifted by about 3 metres. Therefore, 
slip along the plate interface at depth 
must have exceeded the 27-metre peak 
slip inferred from the GeoNet data'; 
it may have been even more than 50 
metres, as suggested from the joint 
modelling’ of the GeoNet data and sea- 
bottom pressure records of the tsunami 
waves. For comparison, this is about 
twice the peak slip determined for the 
giant earthquakes in Sumatra in 2004 
($M_w$ 9.4) and in Chile in 2010 ($M_w$ 9.0), 
and larger than that estimated for the 
biggest earthquake ever recorded’, the 
$M_w$ 9.5 event of 1960, which ruptured more 
than 1,000 km of the plate boundary off the 
coast of southern Chile. 

Over the 15 years preceding the 2011 event, 
the GeoNet data’ had revealed the slow accu- 
mulation of strain across Honshu, with the 
Pacific plate squeezing and dragging down the 
eastern edge of Honshu. We know, however, 
that the coast of Honshu is being uplifted in 
the long term, so a significant fraction of that 
‘interseismic’ strain — strain accumulating 
between earthquakes — must be compen- 
sated by sudden episodes of uplift. The current 
model holds that interseismic strain on the 
Figure 1 | Location of the Tohoku-Oki earthquake. The 
earthquake, with its epicentre marked by a star, ruptured the plate 
interface along which the Pacific plate slides beneath northern 
Honshu at a rate of 8 centimetres per year. Ozawa and colleagues’ 
analysis’ shows that the rupture area and distribution of slip 
(represented by the black contour lines) roughly coincide with 

a patch of the plate interface that had remained locked over the 
preceding decades® (coloured area east of Sendai). The earthquake 
source was extremely compact and produced very large slip 
at relatively shallow depth (less than 20 km depth), hence the 
devastating tsunami. The other well-locked patch in the north 
coincides with rupture areas of large historical earthquakes (in 
particular, the $M_w$ 8.5 Sanriku earthquake of 1896 and the $M_w$ 8.2 
Tokachi-Oki earthquake of 1968). 
upper plate is purely elastic, and is ‘recovered’ 
during seismic rupture of the plate interface, 
so that in the long run the upper plate does not 
deform’. This assumption provides a rationale 
to relate slip on the plate interface to inter- 
seismic strain on the upper plate. Where the 
plate interface is creeping, strain on the upper 
plate is negligible; but where the plate interface 
is locked, the upper plate is compressed and 
dragged down, building up elastic strain until 
it is released when the locked patches slip. 

Several earlier studies adopted this assump- 
tion®”* and found that the measured strain 
across Honshu required a large, locked 
patch off the coast of Sendai (Fig. 1). 
The rupture area of the Tohoku-Oki 
earthquake coincides quite well with 
that area’. A noteworthy discrepancy, 
however, is that the rupture reached 
closer to the Japan trench, where inter- 
seismic models suggested there was 
little locking. The particularly extensive, 
shallow slip seen in the Tohoku-Oki 
event could be due either to high pre- 
seismic stress left over from previous 
ruptures of the plate interface that failed 
to reach the trench, or, as seismological 
investigations suggest’, to specific prop- 
erties of the plate interface. In any case, 
the observed slip requires the shallow 
plate interface to have remained locked, 
at least partially, in the period before the 
earthquake. 

The published interseismic models*”* 
indicated little locking at shallow depth, 
essentially as a result of built-in meth- 
odological assumptions: the shallow 
portion of the plate interface is actually 
not well constrained if only onshore data 
are used’. These models may have been 
misleadingly interpreted as discount- 
ing the possibility of extensive shallow 
slip, which in fact occurred. Therefore, 
in the absence of direct constraints from 
sea-bottom geodesy, it may be preferable 
for models to assume maximum locking 
of the shallow zone of the plate inter- 
face. In fact, the interseismic data do not 
exclude a locked region off the coast of 
Sendai extending all the way to the shal- 
low plate-boundary zone at the trench. 
Such an assumption raises questions for 
assessing the frequency of Tohoku-Oki-like 
earthquakes. 

The estimated slip along the plate bound- 
ary off the coast of northern Honshu — due 
to earthquakes over the past few centuries — 
falls well short of balancing the slip deficit that 
should have accumulated over that period 
owing to interseismic locking. So it might 
seem that a large earthquake there was overdue. 
Indeed, according to the published interseismic 
models, interseismic strain builds up really fast 
on that boundary: it should take only a few cen- 
turies to accumulate enough strain to generate 
an $M_w$ 9.0 earthquake. Such large events should 
recur even more often if locking of the shallow 
portion of the plate interface is assumed in the 
modelling of interseismic strain. 
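
The "few centuries" figure can be checked with a one-line slip budget, sketched here in Python. It assumes that the plate interface is fully locked, so that the slip deficit grows at the full convergence rate of 8 centimetres per year, and it uses the peak slip values quoted earlier (27 metres and 50 metres). Peak slip overstates the average slip over the rupture, so these are rough upper figures rather than a hazard estimate. The `coupling` parameter is an illustrative addition.

```python
def years_to_accumulate(slip_m, convergence_rate_m_per_yr, coupling=1.0):
    """Years needed to build a slip deficit of `slip_m` metres.

    coupling: locked fraction of the plate motion (1.0 means fully locked).
    """
    if not 0.0 < coupling <= 1.0:
        raise ValueError("coupling must be in the interval (0, 1]")
    return slip_m / (convergence_rate_m_per_yr * coupling)


PLATE_RATE = 0.08  # m per year, convergence rate beneath northern Honshu

years_for_27_m = years_to_accumulate(27.0, PLATE_RATE)
years_for_50_m = years_to_accumulate(50.0, PLATE_RATE)
print(f"27 m of slip: about {years_for_27_m:.0f} yr")
print(f"50 m of slip: about {years_for_50_m:.0f} yr")
```

Both values fall between roughly 300 and 650 years, well short of the return period of about 1,000 years suggested by the historical and palaeo-tsunami records. Lower coupling lengthens the interval in proportion.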

By contrast, on the basis of historical and 
palaeo-tsunami records”, large earthquakes 
would be predicted to return only once every 
1,000 years, or even less frequently. The way 
out of this conundrum is not clear. There is no 
evidence for particularly frequent episodes of 
large aseismic slip in that area, and postseismic 
afterslip, although significant’, is much too 
small to balance the slip budget. So either the 
slip deficit accumulating in the interseismic 
period is overestimated (which might happen 
if, for example, a fraction of interseismic strain 
is not recoverable), or it is incorrect to assume 
that geodetic rates measured over a decade or 
so are representative of strain build-up over 
periods of centuries to millennia. 

Another paradoxical and possibly related 
observation is that the Tohoku-Oki earthquake 
induced more than 1 metre of systematic 
coastal subsidence, whereas uplift would have 
been expected to balance the subsidence rate 
of 5 millimetres per year during the inter- 
seismic period. The long-term coastal uplift 
requires deformation events that are large and 
frequent enough to compensate for that sub- 
sidence. This might call for a review of both the 
assumption that interseismic deformation of 
the upper plate is purely elastic, and the corol- 
lary that interseismic elastic strain is relaxed 
only by earthquakes that occur along the plate 
interface. 

Finally, the geodetic data acquired both 
before*”* and after’ the Tohoku-Oki earth- 
quake suggest that the plate interface south of 
the rupture area is mostly creeping aseismi- 
cally. There is thus no indication of a major 
zone of strain build-up on that portion of the 
plate boundary that might threaten Tokyo. But 
it is clear that although geodetic networks are 
invaluable instruments for observing strain 
accumulation and seismic release at plate 
boundaries and major faults, we don't yet 
have an adequate theory to use these data for 
earthquake and tsunami hazard assessment. 


Jean-Philippe Avouac is in the Division of 
Geological and Planetary Sciences, California 
Institute of Technology, Pasadena, California 
91125, USA. 

e-mail: avouac@gps.caltech.edu 


1. Ozawa, S. et al. Nature doi:10.1038/nature10227 
(2011). 

2. Sato, M. et al. Science doi:10.1126/ 
science.1207401 (2011). 

3. Simons, M. et al. Science doi:10.1126/ 
science.1206731 (2011). 

4. Moreno, M. S., Bolte, J., Klotz, J. & Melnick, 
D. Geophys. Res. Lett. 36, L16310, 
doi:10.1029/2009GL039276 (2009). 

5. Suwa, Y., Miura, S., Hasegawa, A., Sato, T. & 
Tachibana, K. J. Geophys. Res. 111, B04402, 
doi:10.1029/2004JB003203 (2006). 

6. Savage, J. C. Annu. Rev. Earth Planet. Sci. 11, 11-43 
(1983). 

7. Hashimoto, C., Noda, A., Sagiya, T. & Matsu’ura, M. 
Nature Geosci. 2, 141-144 (2009). 

8. Loveless, J. P. & Meade, B. J. J. Geophys. Res. Solid 
Earth 115, B02410, doi:10.1029/2008JB006248 
(2010). 

9. Ide, S., Baltay, A. & Beroza, G. C. Science 
doi:10.1126/science.1207020 (2011). 

10. Sawai, Y. et al. Holocene 18, 517-528 (2008). 


